Hydrogenase polypeptide and methods of use

ABSTRACT

Provided herein are polypeptides having hydrogenase activity. The polypeptide may be multimeric, and may have hydrogenase activity of at least 0.05 micromoles H₂ produced min⁻¹ mg protein⁻¹. Also provided herein are polynucleotides encoding the polypeptides, genetically modified microbes that include polynucleotides encoding one or more subunits of the multimeric polypeptide, and methods for making and using the polypeptides.

CONTINUING APPLICATION DATA

This application claims the benefit of U.S. Provisional Application Ser. No. 61/005,383, filed Dec. 5, 2007, which is incorporated by reference herein.

GOVERNMENT FUNDING

The present invention was made with government support under Grant No. DE-FG02-05ER15710, awarded by the Department of Energy. The Government may have certain rights in this invention.

BACKGROUND

Molecular hydrogen (H₂) is typically produced by steam reforming of methane, and platinum is the most commonly used catalyst for hydrogen production. Due to utilization of fossil fuels as a source of methane, as well as the expense, limited availability, sensitivity to poisoning, and bioincompatibility of the catalyst, it is not likely to be utilized in economical energy conversion systems (Bharadwaj and Schmidt. 1995. Fuel Processing Technology 42:109-127, Ghenciu. 2002. Current Opinion in Solid State & Materials Science 6:389-399). However, in 2003 President Bush in the State of the Union Address proposed the Hydrogen Fuel Initiative, the goal of which was to develop new technologies for production and utilization of H₂ as a potential source of energy to replace fossil fuels. In microorganisms, the molecular machine responsible for the biological uptake and evolution of hydrogen is an enzyme known as hydrogenase. Hydrogenase catalyzes the simplest of chemical reactions, the interconversion of the neutral molecule H₂ and its elementary constituents, two protons and two electrons (Eqn. 1).

2H⁺ + 2e⁻ ⇄ H₂   (1)

Ironically, however, while the reaction that they catalyze is simple, hydrogenase enzymes are multimeric proteins and typically are sensitive to air (oxygen). This has to date precluded the facile production of a recombinant form of the major class of hydrogenase, the so-called ‘nickel-iron’ (NiFe) type.

Hydrogenases are found in representatives of most microbial genera, as well as some unicellular eukaryotes (Adams et al. 1980. Biochim Biophys Acta 594:105-76; Cammack et al. 2001. Hydrogen as a fuel: learning from nature. Taylor & Francis, London, New York; Friedrich and Schwartz. 1993. Annual Review of Microbiology 47:351-383; Przybyla et al. 1992. FEMS Microbiology Reviews 88:109-135, Vignais et al. 2001. FEMS Microbiology Reviews 25:455-501). The enzyme allows many microorganisms to use H₂ gas as a source of low potential reductant (H₂/H⁺, E^(o′)=−420 mV), either for carbon fixation or as a source of energy. In aerobic environments, H₂ oxidation can be coupled via membrane electron transport to the reduction of oxygen (O₂/H₂O, E^(o′)=+820 mV). There are a variety of electron acceptors that can be coupled to anaerobic H₂ oxidation, including carbon dioxide, which can be reduced to either methane (by methanogens) or acetate (by acetogens), and sulfate and ferric-iron, which are reduced to sulfide and ferrous iron, respectively. On the other hand, microorganisms that produce H₂ during growth are widespread in anaerobic environments. The production of H₂ is used as a mechanism to dispose of the excess reductant that is generated during the oxidation of organic material. These fermentative organisms conserve energy by chemical synthesis (substrate level phosphorylation) independent of the means by which they dispose of reductant (be it as H₂ or as a reduced organic compound such as ethanol). However, it was recently discovered that some organisms are able to conserve energy directly from the production of H₂ by a novel respiratory mechanism (Sapra et al. 2003. Proc Natl Acad Sci USA 100:7545-50).
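As a rough illustration of the energetics implied by the two standard potentials quoted above, the following sketch computes the standard free-energy change for coupling H₂ oxidation to O₂ reduction from the relation ΔG°′ = −nFΔE°′. The two-electron count follows Eqn. 1 and the Faraday constant is a standard value. The function name is hypothetical and is used only for illustration.

```python
FARADAY_KJ_PER_V_MOL = 96.485  # Faraday constant, kJ per volt per mole of electrons


def delta_g_kj_per_mol(e_donor_mv, e_acceptor_mv, n_electrons=2):
    """Standard free-energy change (kJ/mol) for moving n electrons
    from a donor couple to an acceptor couple (potentials in mV)."""
    delta_e_v = (e_acceptor_mv - e_donor_mv) / 1000.0
    return -n_electrons * FARADAY_KJ_PER_V_MOL * delta_e_v


# Potentials from the text: H2/H+ = -420 mV, O2/H2O = +820 mV
dg = delta_g_kj_per_mol(-420, 820)
print(round(dg, 1))  # roughly -239 kJ per mole of H2 oxidized
```

The large negative value is consistent with aerobic H₂ oxidation serving as a source of energy.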

Two major types of hydrogenase are known: the nickel-iron (NiFe) and the iron-only (Fe) enzymes (Adams. 1990. Biochimica Et Biophysica Acta 1020:115-145; Albracht. 1994. Biochimica Et Biophysica Acta-Bioenergetics 1188:167-204), which are unrelated phylogenetically (Meyer, J. 2007. Cellular and Molecular Life Sciences 64:1063-1084; Vignais et al. 2001. FEMS Microbiology Reviews 25:455-501). The iron-only enzymes are found in only a few types of anaerobic bacteria and some photosynthetic algae, but they have been extensively studied. This work includes structural characterization (Chen et al. 2002. Biochemistry 41:2036-2043; Nicolet et al. 2001. Journal of the American Chemical Society 123:1596-1601; Nicolet et al. 2000. Trends in Biochemical Sciences 25:138-143; Nicolet et al. 1999. Structure with Folding & Design 7:13-23; Peters et al. 1998. Science 282:1853-1858), including potential active site models (Boyke et al. 2004. Journal of the American Chemical Society 126:15151-15160; Tye et al. 2006. Inorg Chem 45:1552-9; Zilberman et al. 2007. Inorg Chem 46:1153-61), and recent insights into their biosynthesis (Mishra et al. 2004. Biochemical and Biophysical Research Communications 324:679-685; Posewitz et al. 2004. Journal of Biological Chemistry 279:25711-25720), as well as recent successful attempts to make recombinant forms of these enzymes (King et al. 2006. J Bacteriol 188:2163-72).

The majority of microorganisms that metabolize H₂, however, contain NiFe-hydrogenases, an example of which is the cytoplasmic NiFe hydrogenase I of the hyperthermophilic archaeon, Pyrococcus furiosus, which grows optimally at 100° C. (Fiala and Stetter. 1986. Archives of Microbiology 145:56-61, Verhagen et al. 2001. Hyperthermophilic Enzymes, Pt A 330:25-30). The NiFe-hydrogenases have also been extensively characterized over the last 40 years, and several crystal structures are available (Garcin et al. 1998. Biochemical Society Transactions 26:396-401, Higuchi. 1999. Structure 7:549-56, Volbeda and Fontecilla-Camps. 2003. Dalton Transactions:4030-4038, Volbeda et al. 1996. Journal of the American Chemical Society 118:12989-12996). They all are made up of at least two subunits, one of which contains the NiFe-catalytic site, while the other contains three iron-sulfur (FeS) clusters. These clusters serve to shuttle electrons from the electron donor to the enzyme to and from the NiFe site in the catalytic subunit. The Ni atom is bound to four cysteinyl residues of this subunit, two of which are near the N-terminus and two near the C-terminus. Two of the four Cys bind a single Fe atom, which is also coordinated, remarkably, by one carbon monoxide (CO) and two cyanide (CN) ligands (Bagley et al. 1995. Biochemistry 34:5527-5535, Happe et al. 1997. Nature 385:126-126, Pierik et al. 1999. Journal of Biological Chemistry 274:3331-3337). These diatomic ligands serve to activate the iron atom (maintaining it in the low spin state) thereby facilitating catalysis. Interestingly, such ligands are also found at the active site of the iron-only hydrogenases (Nicolet et al. 2002. J Inorg Biochem 91:1-8), as well as the mononuclear iron site of a third type of hydrogenase found in a very limited number of archaea (Lyon et al. 2004. Journal of the American Chemical Society 126:14239-14248), an example of convergent evolution toward a similar function.

The hydrogenase of P. furiosus is of particular interest for additional reasons. First, it is obtained from an organism that grows optimally at 100° C. and has been shown to be an exceedingly robust and thermostable enzyme (Bryant and Adams. 1989. J Biol Chem 264:5070-9; Ma and Adams. 2001. Methods Enzymol 331:208-16). Second, in in vitro assays, the enzyme has been shown to be able to generate hydrogen gas by oxidizing NADPH in a reversible reaction (Ma and Adams. 2001. Methods Enzymol 331:208-16; Ma et al. 2000. J Bacteriol 182:1864-71; Ma et al. 1994. FEMS Microbiology Letters 122:245-250), which is a very rare property among the hydrogenases that have been characterized to date. Consequently, the reversible P. furiosus enzyme has utility in generating reductants such as NADPH. Likewise, the P. furiosus enzyme has utility in hydrogen production systems in which carbohydrates are oxidized to generate NADPH, which in turn can be oxidized by the hydrogenase to produce hydrogen gas. The production of hydrogen from glucose in an in vitro cell-free system using purified enzymes was first demonstrated over a decade ago (Woodward et al. 1996. Nat Biotechnol 14:872-4). This work was very recently extended to the conversion of starch to hydrogen using an in vitro cell-free system made up of thirteen different enzymes (Zhang et al. 2007. PLoS ONE 2:e456). Twelve of the enzymes are used to oxidize starch and generate carbon dioxide and NADPH, and the thirteenth, P. furiosus hydrogenase, oxidizes NADPH and produces hydrogen gas. In this system, the hydrogenase was purified from P. furiosus biomass (Ma and Adams. 2001. Methods Enzymol 331:208-16) since a recombinant form of this enzyme was not available.

SUMMARY OF THE INVENTION

Provided herein are polypeptides having hydrogenase activity. In one aspect, the polypeptide is a dimeric polypeptide. The amino acid sequence of the first subunit and the amino acid sequence of SEQ ID NO:6 have at least 80% identity, and the amino acid sequence of the second subunit and the amino acid sequence of SEQ ID NO:8 have at least 80% identity. At least one subunit may be a fusion that includes a heterologous amino acid sequence. The dimeric polypeptide may further include two more subunits to result in a tetrameric polypeptide. The amino acid sequence of the third subunit and the amino acid sequence of SEQ ID NO:2 have at least 80% identity, and the amino acid sequence of the fourth subunit and the amino acid sequence of SEQ ID NO:4 have at least 80% identity. The multimeric polypeptide may be isolated, or purified. The tetrameric polypeptide may be present in a genetically modified microbial cell. In some aspects, the genetically modified microbial cell is not Pyrococcus furiosus, P. abyssi, P. horikoshii, Thermococcus kodakaraensis, or T. onnurineus. It may be present in a microbial cell, such as, but not limited to, Escherichia coli.

The multimeric polypeptide may have hydrogenase activity of at least 0.05 micromoles H₂ produced min⁻¹ mg protein⁻¹ when isolated by centrifugation of a whole cell extract at 100,000×g, heat-treatment at 80° C. for 30 minutes, and re-centrifugation at 100,000×g. The heterologous amino acid sequence may be present at, for instance, the amino terminal end of a subunit, or the carboxy terminal end of a subunit. The multimeric polypeptide may include one or more chemically modified subunits. Also provided herein is a polypeptide consisting of two subunits or four subunits.
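The activity threshold above is a specific activity, i.e., H₂ produced per unit time per unit mass of protein. The following minimal sketch shows how such a value could be computed from assay readings and compared with the 0.05 threshold. The function names and the assay numbers are hypothetical and are not measurements from this disclosure.

```python
ACTIVITY_THRESHOLD = 0.05  # micromoles H2 per minute per mg protein (from the text)


def specific_activity(h2_micromoles, minutes, protein_mg):
    """Specific activity in micromoles H2 min^-1 mg protein^-1."""
    if minutes <= 0 or protein_mg <= 0:
        raise ValueError("minutes and protein_mg must be positive")
    return h2_micromoles / (minutes * protein_mg)


def meets_threshold(activity, threshold=ACTIVITY_THRESHOLD):
    """True if the measured specific activity is at least the threshold."""
    return activity >= threshold


# Hypothetical assay: 1.2 micromoles H2 evolved in 10 minutes by 0.5 mg protein
activity = specific_activity(1.2, 10, 0.5)
print(activity, meets_threshold(activity))
```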

Also provided herein are genetically modified microbes. A genetically modified microbe may include an exogenous polypeptide, wherein the exogenous polypeptide includes two subunits. The first subunit includes an amino acid sequence, and the amino acid sequence of the first subunit and the amino acid sequence of SEQ ID NO:6 have at least 80% identity. The second subunit includes an amino acid sequence, and the amino acid sequence of the second subunit and the amino acid sequence of SEQ ID NO:8 have at least 80% identity. The two subunits form a dimeric polypeptide having hydrogenase activity. The dimeric polypeptide may further include two more subunits to form a tetrameric polypeptide having hydrogenase activity, wherein the third subunit includes an amino acid sequence, and the amino acid sequence of the third subunit and the amino acid sequence of SEQ ID NO:2 have at least 80% identity. The fourth subunit includes an amino acid sequence, and the amino acid sequence of the fourth subunit and the amino acid sequence of SEQ ID NO:4 have at least 80% identity. At least one subunit can be a fusion that includes a heterologous amino acid sequence. A genetically modified microbe may include one or more of the accessory polynucleotides described herein.

A genetically modified microbe may include two exogenous polynucleotides, wherein the exogenous polynucleotides each encode a subunit. The first subunit can include an amino acid sequence, and the amino acid sequence of the first subunit and the amino acid sequence of SEQ ID NO:6 have at least 80% identity. The second subunit can include an amino acid sequence, and the amino acid sequence of the second subunit and the amino acid sequence of SEQ ID NO:8 have at least 80% identity. The two subunits form a dimeric polypeptide having hydrogenase activity. The genetically modified microbe can further include two more exogenous polynucleotides, wherein the two more exogenous polynucleotides each encode a subunit. The third subunit can include an amino acid sequence, and the amino acid sequence of the third subunit and the amino acid sequence of SEQ ID NO:2 have at least 80% identity. The fourth subunit can include an amino acid sequence, and the amino acid sequence of the fourth subunit and the amino acid sequence of SEQ ID NO:4 have at least 80% identity. The four subunits form a tetrameric polypeptide having hydrogenase activity. At least one subunit can be a fusion that includes a heterologous amino acid sequence, such as a histidine tag.

Further provided herein are methods for making a polypeptide having hydrogenase activity. The methods may include providing a genetically modified microbe including exogenous polynucleotides as described herein, and incubating the microbe under conditions suitable for expression of the exogenous polynucleotides to produce a multimeric polypeptide having hydrogenase activity. The method may further include isolating, or optionally purifying, the polypeptide after the incubating.

Provided herein are methods for using a polypeptide having hydrogenase activity. The methods may include providing a polypeptide described herein, and incubating the polypeptide under conditions suitable for producing H₂. The produced H₂ may be collected.

In one aspect, the polypeptide is an isolated or purified polypeptide. The polypeptide may be present on a surface, such as one that conducts electricity, e.g., an anode. The polypeptide may be chemically modified. The incubating may include conditions that include a polysaccharide, such as a starch or a cellulose. The conditions can include a temperature of at least 37° C. or at least 70° C.

In another aspect, the polypeptide is present in a genetically modified microbe. The incubating may include incubating the microbial cell under conditions suitable for the expression of the polypeptide. The incubating may include conditions that include a polysaccharide, such as a starch or a cellulose. The conditions can include a temperature of at least 37° C. or at least 70° C.

Provided herein are methods for using a polypeptide having hydrogenase activity. The methods for using a polypeptide having hydrogenase activity may include providing a polypeptide described herein, and incubating the polypeptide under conditions suitable for producing NADPH. The produced NADPH may be collected.

In one aspect, the polypeptide is an isolated or purified polypeptide. The conditions may include molecular hydrogen, and a temperature of at least 37° C. In another aspect, the polypeptide is present in a genetically modified microbe. The incubating may include incubating the microbial cell under conditions suitable for the expression of the polypeptide. The conditions may include a temperature of at least 37° C.

Also provided herein is an expression system for assembling a polypeptide having hydrogenase activity. The expression system includes the plasmids described herein. The plasmids may be present in a microbe, such as an E. coli.

As used herein, the term “polypeptide” refers broadly to a polymer of two or more amino acids joined together by peptide bonds. The term “polypeptide” also includes molecules which contain more than one polypeptide joined by a disulfide bond, or complexes of polypeptides that are joined together, covalently or noncovalently, as multimers (e.g., dimers, trimers, tetramers). A polypeptide also may possess non-protein (non-amino acid) ligands including, but not limited to, inorganic iron (Fe), nickel (Ni), inorganic iron-sulfur centers such as [4Fe-4S] clusters, and other organic ligands such as carbon monoxide (CO), cyanide (CN) and flavin. Thus, the terms peptide, oligopeptide, enzyme, subunit, and protein are all included within the definition of polypeptide and these terms are used interchangeably. It should be understood that these terms do not connote a specific length of a polymer of amino acids, nor are they intended to imply or distinguish whether the polypeptide is produced using recombinant techniques, chemical or enzymatic synthesis, or is naturally occurring. As used herein, “heterologous amino acid sequence” refers to amino acid sequences that are not normally present as part of a polypeptide present in a wild-type cell. For instance, “heterologous amino acid sequence” includes extra amino acids at the amino terminal end or carboxy terminal end of a polypeptide that are not normally part of a polypeptide that is present in a wild-type cell.

As used herein, “hydrogenase activity” refers to the ability of a polypeptide to catalyze the formation of molecular hydrogen (H₂).

As used herein, “identity” refers to structural similarity between two polypeptides or two polynucleotides. The structural similarity between two polypeptides is determined by aligning the residues of the two polypeptides (e.g., a candidate amino acid sequence and a reference amino acid sequence, such as SEQ ID NO:2, 4, 6, or 8) to optimize the number of identical amino acids along the lengths of their sequences; gaps in either or both sequences are permitted in making the alignment in order to optimize the number of shared amino acids, although the amino acids in each sequence must nonetheless remain in their proper order. The structural similarity is typically at least 80% identity, at least 81% identity, at least 82% identity, at least 83% identity, at least 84% identity, at least 85% identity, at least 86% identity, at least 87% identity, at least 88% identity, at least 89% identity, at least 90% identity, at least 91% identity, at least 92% identity, at least 93% identity, at least 94% identity, at least 95% identity, at least 96% identity, at least 97% identity, at least 98% identity, or at least 99% identity. A candidate amino acid sequence can be isolated from a microbe, preferably a Pyrococcus spp., more preferably a P. furiosus, or can be produced using recombinant techniques, or chemically or enzymatically synthesized. Structural similarity may be determined, for example, using sequence techniques such as the BESTFIT algorithm in the GCG package (Madison Wis.), or the Blastp program of the BLAST 2 search algorithm, as described by Tatusova, et al. (FEMS Microbiol Lett 1999, 174:247-250), and available through the World Wide Web, for instance at the internet site maintained by the National Center for Biotechnology Information, National Institutes of Health. Preferably, structural similarity between two amino acid sequences is determined using the Blastp program of the BLAST 2 search algorithm.
Preferably, the default values for all BLAST 2 search parameters are used, including matrix=BLOSUM62; open gap penalty=11, extension gap penalty=1, gap x_dropoff=50, expect=10, wordsize=3, and optionally, filter on. In the comparison of two amino acid sequences using the BLAST search algorithm, structural similarity is referred to as “identities.”
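The percent identity described above can be illustrated with a minimal sketch that counts identical positions in two sequences that have already been aligned, with gaps written as "-". This is not the Blastp algorithm and does not perform the alignment itself. The function name and the toy sequences are hypothetical, and the denominator (alignment length, including gap columns) is an assumption chosen to mirror how BLAST output reports "identities."

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical pre-aligned toy sequences (not taken from SEQ ID NO:2, 4, 6, or 8)
candidate = "MKV-LAGTEW"
reference = "MKVALAGSEW"
pid = percent_identity(candidate, reference)
print(pid, pid >= 80.0)  # 8 identical columns out of 10
```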

The structural similarity between two polynucleotides is determined by aligning the residues of the two polynucleotides (e.g., a candidate nucleotide sequence and a reference nucleotide sequence, such as SEQ ID NO:1, 3, 5, or 7) to optimize the number of identical nucleotides along the lengths of their sequences; gaps in either or both sequences are permitted in making the alignment in order to optimize the number of shared nucleotides, although the nucleotides in each sequence must nonetheless remain in their proper order. The structural similarity is typically at least 80% identity, at least 81% identity, at least 82% identity, at least 83% identity, at least 84% identity, at least 85% identity, at least 86% identity, at least 87% identity, at least 88% identity, at least 89% identity, at least 90% identity, at least 91% identity, at least 92% identity, at least 93% identity, at least 94% identity, at least 95% identity, at least 96% identity, at least 97% identity, at least 98% identity, or at least 99% identity. A candidate nucleotide sequence can be isolated from a microbe, preferably a Pyrococcus spp., more preferably a P. furiosus, or can be produced using recombinant techniques, or chemically or enzymatically synthesized. Structural similarity may be determined, for example, using sequence techniques such as GCG FastA (Genetics Computer Group, Madison, Wis.), MacVector 4.5 (Kodak/IBI software package) or other suitable sequencing programs or methods known in the art. Preferably, structural similarity between two nucleotide sequences is determined using the Blastn program of the BLAST 2 search algorithm, as described by Tatusova, et al. (1999. FEMS Microbiol Lett. 174:247-250), and available through the World Wide Web, for instance at the internet site maintained by the National Center for Biotechnology Information, National Institutes of Health. 
Preferably, the default values for all BLAST 2 search parameters are used, including reward for match=1, penalty for mismatch=−2, open gap penalty=5, extension gap penalty=2, gap x_dropoff=50, expect=10, wordsize=11, and optionally, filter on. In the comparison of two nucleotide sequences using the BLAST search algorithm, structural similarity is referred to as “identities.”

As used herein, an “isolated” substance is one that has been removed from its natural environment, produced using recombinant techniques, or chemically or enzymatically synthesized. For instance, a polypeptide, a polynucleotide, H₂, or NADPH can be isolated. Preferably, a substance is purified, i.e., is at least 60% free, preferably at least 75% free, and most preferably at least 90% free from other components with which it is naturally associated.

As used herein, the term “polynucleotide” refers to a polymeric form of nucleotides of any length, either ribonucleotides or deoxynucleotides, and includes both double- and single-stranded RNA and DNA. A polynucleotide can be obtained directly from a natural source, or can be prepared with the aid of recombinant, enzymatic, or chemical techniques. A polynucleotide can be linear or circular in topology. A polynucleotide may be, for example, a portion of a vector, such as an expression or cloning vector, or a fragment. A polynucleotide may include nucleotide sequences having different functions, including, for instance, coding regions, and non-coding regions such as regulatory regions.

As used herein, the terms “coding region,” “coding sequence,” and “open reading frame” are used interchangeably and refer to a nucleotide sequence that encodes a polypeptide and, when placed under the control of appropriate regulatory sequences expresses the encoded polypeptide. The boundaries of a coding region are generally determined by a translation start codon at its 5′ end and a translation stop codon at its 3′ end. A “regulatory sequence” is a nucleotide sequence that regulates expression of a coding sequence to which it is operably linked. Non-limiting examples of regulatory sequences include promoters, enhancers, transcription initiation sites, translation start sites, translation stop sites, and transcription terminators. The term “operably linked” refers to a juxtaposition of components such that they are in a relationship permitting them to function in their intended manner. A regulatory sequence is “operably linked” to a coding region when it is joined in such a way that expression of the coding region is achieved under conditions compatible with the regulatory sequence.
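The boundary rule above (a start codon at the 5′ end and an in-frame stop codon at the 3′ end) can be sketched as a simple scan. This is an illustration only: it assumes ATG as the start codon and the three standard stop codons, whereas real genomes may also use alternative start codons. The function name and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard stop codons (assumed)


def find_coding_region(dna, start_codon="ATG"):
    """Return (start, end) indices of the first start-to-stop reading frame,
    with end exclusive, or None if no complete frame is found."""
    seq = dna.upper()
    start = seq.find(start_codon)
    while start != -1:
        # Walk codon by codon in the frame set by this start codon
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return start, i + 3
        start = seq.find(start_codon, start + 1)
    return None


# Hypothetical toy sequence; the out-of-frame "TGA" inside "ATGA" is ignored
region = find_coding_region("CCATGAAATTTTAGGG")
print(region)
```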

A polynucleotide that includes a coding region may include heterologous nucleotides that flank one or both sides of the coding region. As used herein, “heterologous nucleotides” refer to nucleotides that are not normally present flanking a coding region that is present in a wild-type cell. For instance, a coding region present in a wild-type microbe and encoding a polypeptide described herein is flanked by homologous sequences, and any other nucleotide sequence flanking the coding region is considered to be heterologous. Examples of heterologous nucleotides include, but are not limited to regulatory sequences. Typically, heterologous nucleotides are present in a polynucleotide described herein through the use of standard genetic and/or recombinant methodologies well known to one skilled in the art. A polynucleotide described herein may be included in a suitable vector.

As used herein, an “exogenous polynucleotide” refers to a polynucleotide that is not normally or naturally found in a microbe. As used herein, the term “endogenous polynucleotide” refers to a polynucleotide that is normally or naturally found in a microbe. An “endogenous polynucleotide” is also referred to as a “native polynucleotide.”

The terms “complement” and “complementary” as used herein, refer to the ability of two single stranded polynucleotides to base pair with each other, where an adenine on one strand of a polynucleotide will base pair to a thymine or uracil on a strand of a second polynucleotide and a cytosine on one strand of a polynucleotide will base pair to a guanine on a strand of a second polynucleotide. Two polynucleotides are complementary to each other when a nucleotide sequence in one polynucleotide can base pair with a nucleotide sequence in a second polynucleotide. For instance, 5′-ATGC and 5′-GCAT are complementary. The term “substantial complement” and cognates thereof as used herein, refer to a polynucleotide that is capable of selectively hybridizing to a specified polynucleotide under stringent hybridization conditions. Stringent hybridization can take place under a number of pH, salt and temperature conditions. The pH can vary from 6 to 9, preferably 6.8 to 8.5. The salt concentration can vary from 0.15 M sodium to 0.9 M sodium, and other cations can be used as long as the ionic strength is equivalent to that specified for sodium. The temperature of the hybridization reaction can vary from 30° C. to 80° C., preferably from 45° C. to 70° C. Additionally, other compounds can be added to a hybridization reaction to promote specific hybridization at lower temperatures, such as at or approaching room temperature. Among the compounds contemplated for lowering the temperature requirements is formamide. Thus, a polynucleotide is typically substantially complementary to a second polynucleotide if hybridization occurs between the polynucleotide and the second polynucleotide. As used herein, “specific hybridization” refers to hybridization between two polynucleotides under stringent hybridization conditions.
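The base-pairing rule above can be sketched as a check that two strands, both written 5′ to 3′, pair in antiparallel orientation. This covers only exact complementarity and does not model hybridization stringency (pH, salt, or temperature). The function name is hypothetical.

```python
# Allowed pairs from the text: A with T or U, and C with G
BASE_PAIRS = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
}


def are_complementary(strand_a, strand_b):
    """True if two equal-length strands (each written 5'->3') base pair
    at every position when aligned antiparallel."""
    a, b = strand_a.upper(), strand_b.upper()
    if not a or len(a) != len(b):
        return False
    return all((x, y) in BASE_PAIRS for x, y in zip(a, reversed(b)))


print(are_complementary("ATGC", "GCAT"))  # the example pair given in the text
print(are_complementary("ATGC", "ATGC"))
```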

As used herein, “genetically modified microbe” refers to a microbe which has been altered “by the hand of man.” A genetically modified microbe includes a microbe into which has been introduced an exogenous polynucleotide, e.g., an expression vector. Genetically modified microbe also refers to a microbe that has been genetically manipulated such that endogenous nucleotides have been altered to include a mutation, such as a deletion, an insertion, a transition, a transversion, or a combination thereof. For instance, an endogenous coding region could be deleted. Such mutations may result in a polypeptide having a different amino acid sequence than was encoded by the endogenous polynucleotide. Another example of a genetically modified microbe is one having an altered regulatory sequence, such as a promoter, to result in increased or decreased expression of an operably linked endogenous coding region.

Conditions that are “suitable” for an event to occur, such as expression of an exogenous polynucleotide in a cell to produce a polypeptide, or production of molecular hydrogen or NADPH, or “suitable” conditions are conditions that do not prevent such events from occurring. Thus, these conditions permit, enhance, facilitate, and/or are conducive to the event.

The term “and/or” means one or all of the listed elements or a combination of any two or more of the listed elements.

The words “preferred” and “preferably” refer to embodiments of the invention that may afford certain benefits, under certain circumstances. However, other embodiments may also be preferred, under the same or other circumstances. Furthermore, the recitation of one or more preferred embodiments does not imply that other embodiments are not useful, and is not intended to exclude other embodiments from the scope of the invention.

The term “comprises” and variations thereof do not have a limiting meaning where these terms appear in the description and claims.

Unless otherwise specified, “a,” “an,” “the,” and “at least one” are used interchangeably and mean one or more than one.

Also herein, the recitations of numerical ranges by endpoints include all numbers subsumed within that range (e.g., 1 to 5 includes 1, 1.5, 2, 2.75, 3, 3.80, 4, 5, etc.).

For any method disclosed herein that includes discrete steps, the steps may be conducted in any feasible order. And, as appropriate, any combination of two or more steps may be conducted simultaneously.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1. Construction of anaerobic expression vector pC11A-CDABI.

FIG. 2. Construction of anaerobic expression vector pC3AR-slyD.

FIG. 3. Construction of anaerobic expression vector pEA-SHI.

FIG. 4. Construction of anaerobic expression vector pRA-EF.

FIG. 5. Immunoanalysis using antibodies to the catalytic subunit (PF0894). MW 1001 SHICDABIEFSlyD, MW 1001 containing the coding regions HypC, HypD, HypF, HypE, HypA, HypB, HycI, and SlyD. Native Pf SHI, native P. furiosus SHI hydrogenase.

FIG. 6. QPCR analysis of the expression of exogenous coding regions in E. coli.

FIG. 7. Amino acid sequence and nucleotide sequence of the polypeptides and polynucleotides referenced in Table 1. Coding regions and deduced polypeptide sequences of Pyrococcus furiosus DSM3638 used herein. All P. furiosus DNA and predicted protein sequences were derived from the deposited Genbank sequence NC_003413. Accession numbers refer to specific sections of this DNA sequence or the translated open reading frames encoded therein. Sequence identification numbers for these sequences are shown in Table 1.

FIG. 8. Maps and complete nucleotide sequences of four expression vectors. pEA-SH1, SEQ ID NO:29; pC11A-CDABI, SEQ ID NO:30; pRA-EF, SEQ ID NO:31; and pC3AR-slyD, SEQ ID NO:32.

FIG. 9. MV (methyl viologen)-linked hydrogenase activity of native versus recombinant P. furiosus soluble hydrogenase I.

FIG. 10. Production of MV-Linked Hydrogenase activity at 80° C. in recombinant E. coli MW/rSHI-C. The results from two separate cultures (one indicated by circles, one by triangles) are shown. The growth curves are shown by solid symbols.

FIG. 11. High Density 5-Liter Controlled Fermentation of E. coli MW/rSHI-C.

FIG. 12. Recombinant Hydrogenase Purification Scheme.

FIG. 13. SDS Gel Analysis of Recombinant Hydrogenase Purification. WCE, whole cell extract; S100, cytoplasmic extract after a 100,000×g centrifugation; DEAE pool, pool from DEAE Sepharose column; and PS pool, pool from Phenyl Sepharose column. The PF numbers and the calculated molecular weights for the four subunits of the hydrogenase are indicated.

FIG. 14. SDS Gel Analysis of Highly Purified Recombinant Hydrogenase. PS pool, pool from Phenyl Sepharose column; native SHI, native hydrogenase purified from P. furiosus; S200, Sephacryl S-200 eluate; HAP, Hydroxyapatite eluate.

FIG. 15. Metal Analysis of Phenyl Sepharose fractions.

FIG. 16. Thermal Sensitivity of Recombinant Hydrogenase.

FIG. 17. Oxygen Sensitivity of Recombinant Hydrogenase.

FIG. 18. Expected Interactions Between Tetrameric Recombinant Hydrogenase and MV and NADPH.

FIG. 19. Expected Interactions Between Dimeric Recombinant Hydrogenase and MV and NADPH.

FIG. 20. pEA-0893/0894 (plasmid map and nucleotide sequence, SEQ ID NO:33).

FIG. 21. Alignments of each of the four subunits of P. furiosus hydrogenase I and other related hydrogenases from P. abyssi, P. horikoshii, and Thermococcus kodakaraensis. In each alignment identical residues are not shaded, similar residues are boxed, and non-similar residues are shaded dark gray. In each alignment, PF, P. furiosus; PAB, P. abyssi; TK, Thermococcus kodakaraensis; and PH, P. horikoshii. The gene identifiers refer to the coding regions encoding each polypeptide. PF0891-PF0894 (SEQ ID NOs:2, 4, 6, and 8, respectively) refers to the coding regions present at Genbank Accession No. NC_003413; PAB1784-PAB1787 (SEQ ID NOs:34, 35, 36, and 37, respectively) refers to the coding regions present at Genbank Accession No. AL096836; TK2069-TK2072 (SEQ ID NOs:38, 39, 40, and 41, respectively) refers to the coding regions present at Genbank Accession No. NC_006624; and PH1290-1294 (SEQ ID NOs:42, 43, 44, and 45, respectively) refers to the coding regions present at Genbank Accession No. NC_000961. A. Alignment of the beta subunits. B. Alignment of the gamma subunits. C. Alignment of the delta subunits. D. Alignment of the alpha subunits.

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

The expression of a NiFe-hydrogenase from an extremophile is expected to be inactive and unfolded and consequently not stable when expressed in Escherichia coli. We expressed the catalytic subunit (SEQ ID NO:8) in E. coli and to our surprise found that the monomeric subunit was stable. However, the stable expression of one subunit did not indicate that the other structural and accessory proteins would also be stable, and it was expected that chaperones (to stabilize unfolded protein) would be required for the proper assembly of the NiFe site. Furthermore, successful heterologous expression, meaning expression (transcription and translation) of genes not normally found in a given cell, of genes that encode such a molecular machine as a NiFe-hydrogenase has not been possible, in part because there are a large number of accessory proteins involved in its assembly. Despite the fact that the host bacterium used here, E. coli, synthesizes its own native hydrogenases (all integral membrane proteins) under anaerobic conditions, attempts to express the genes encoding hydrogenases from other organisms have typically not been done in E. coli, but rather in very closely related organisms (Bascones et al. 2000. Appl Environ Microbiol 66:4292-9; King et al. 2006. J Bacteriol 188:2163-72; Lenz et al. 2005. J Bacteriol 187:6590-5; Morimoto et al. 2005. FEMS Microbiology Letters 246:229-34; Porthun et al. 2002. Arch Microbiol 177:159-66; Rousset et al. 1998. Journal of Bacteriology 180:4982-4986). Only recently have attempts been made to express hydrogenases (from Synechocystis sp.) in E. coli (Maeda et al. 2007. BMC Biotechnol 7:25) and this apparently only has the effect of limiting H₂ uptake in the recombinant strains. Proteins playing a role in the assembly of NiFe hydrogenases in E. coli have been extensively characterized (Bock et al. 2006. Adv Microb Physiol 51:1-71), and homologs of the genes encoding eight of these proteins exist in P. furiosus. Described herein is a system for successful heterologous overexpression of a functional and tagged hyperthermophilic NiFe hydrogenase under anaerobic conditions in the common laboratory protein expression host bacterium E. coli, using the heterologously-expressed accessory proteins from P. furiosus while simultaneously expressing those encoding the protein components of P. furiosus hydrogenase.

Provided herein are polypeptides having hydrogenase activity. Such polypeptides may be referred to herein as hydrogenase polypeptides. A polypeptide having hydrogenase activity may include four subunits. The first subunit includes the amino acid sequence SEQ ID NO:2, or an amino acid sequence having structural similarity thereto, the second subunit includes the amino acid sequence SEQ ID NO:4 or an amino acid sequence having structural similarity thereto, the third subunit includes the amino acid sequence SEQ ID NO:6 or an amino acid sequence having structural similarity thereto, and the fourth subunit includes the amino acid sequence SEQ ID NO:8 or an amino acid sequence having structural similarity thereto. Such a polypeptide may be isolated from a microbe, such as thermophiles (prokaryotic microbes that grow in environments at temperatures of between 60° C. and 79° C.), and hyperthermophiles (prokaryotic microbes that grow in environments at temperatures above 80° C.). Examples include archaea such as, but not limited to, a member of the genera Pyrococcus, for instance P. furiosus, P. abyssi, or P. horikoshii, or a member of the genera Thermococcus, for instance, T. kodakaraensis or T. onnurineus, or may be produced using recombinant techniques, or chemically or enzymatically synthesized.

A polypeptide provided herein also includes various subcomplexes. A subcomplex is defined as an engineered version of the hydrogenase polypeptide containing fewer than the four subunits of the natively purified enzyme. For example, a subcomplex may be the alpha subunit alone (SEQ ID NO: 8), the alpha subunit with one other subunit (SEQ ID NO: 6, 4 or 2), or the alpha subunit with some combination of the two other subunits. Accordingly, a hydrogenase polypeptide may be monomeric, dimeric, trimeric, or tetrameric. One example of a hydrogenase polypeptide has 2 subunits, a first subunit that includes the amino acid sequence SEQ ID NO:8, or an amino acid sequence having structural similarity thereto, and a second subunit that includes the amino acid sequence SEQ ID NO:6 or an amino acid sequence having structural similarity thereto.

The hydrogenase activity of a hydrogenase polypeptide of the present invention may be determined by routine methods known in the art. Preferably, a hydrogen evolution assay is used as described herein. For instance, a cell extract may be tested for hydrogen evolution after preparation of a whole cell extract, centrifugation at 100,000×g, heat-treatment at 80° C. for 30 minutes, and re-centrifugation at 100,000×g (referred to as an S100 fraction). The standard assay conditions may include using 5 mL stoppered vials containing 2 mL of anaerobic 100 mM EPPS buffer pH 8.4, 10 mM sodium dithionite, and 1 mM Methyl Viologen under an atmosphere of argon. Typically, 0.5 milligrams of protein is added when measuring the activity of protein from an 80° C.-treated S100 fraction, and no greater than 0.005 milligrams of protein is added when measuring the activity of protein from a column, such as a DEAE Sepharose and/or Phenyl Sepharose column. The vials are preheated at 80° C. for 1 minute, and 200 μL of sample is injected into the vial. After a period of time, for instance, 6 minutes, samples (100 μL) of the headspace of the sealed vial can be removed with a gas-tight syringe, and then injected into a gas chromatograph. The resulting hydrogen peak can be compared to a known standard curve to calculate micromoles of hydrogen produced per mL of assay solution. The specific activity is at least 0.05, at least 0.1, or at least 0.125 micromoles H₂ produced min⁻¹ mg protein⁻¹. If the hydrogenase polypeptide is further purified, for instance using column chromatography with DEAE Sepharose or a similar matrix, and Phenyl Sepharose or a similar matrix, as described herein, the specific activity is at least 0.5, at least 1, at least 5, or at least 7.5 micromoles H₂ produced min⁻¹ mg protein⁻¹. A hydrogenase polypeptide described herein that is to be tested may be expressed in a microbe, preferably an E. coli described herein, or produced using recombinant techniques, chemical or enzymatic synthesis. If the hydrogenase polypeptide is expressed in a microbe, preferably the microbe has undetectable levels of endogenous hydrogenase activity. Since most microbes do naturally express hydrogenase activity, microbes useful for expression of the hydrogenase polypeptides described herein may be engineered to not express endogenous hydrogenase activity. An example of such a microbe is MW1001 (Maeda et al. 2007. BMC Biotechnol 7:25). Other microbes can be engineered using methods known in the art to not express endogenous hydrogenase activity.
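
The conversion from a gas chromatograph reading to a specific activity can be illustrated with a short calculation. The following is a minimal sketch only, not part of the assay itself: the function name, its parameters, and the numerical reading are hypothetical, and it assumes that the standard curve already yields micromoles of H₂ per mL of assay solution and that the relevant liquid volume is the 2 mL of buffer.

```python
def specific_activity(umol_h2_per_ml, assay_volume_ml, minutes, mg_protein):
    """Return specific activity in micromoles H2 produced per minute per mg protein.

    umol_h2_per_ml  : H2 read from the standard curve, per mL of assay solution
    assay_volume_ml : liquid volume of the assay (assumed here to be 2 mL)
    minutes         : incubation time before the headspace was sampled
    mg_protein      : amount of protein added to the vial
    """
    if minutes <= 0 or mg_protein <= 0:
        raise ValueError("incubation time and protein amount must be positive")
    total_umol_h2 = umol_h2_per_ml * assay_volume_ml
    return total_umol_h2 / (minutes * mg_protein)


# Hypothetical reading for an 80 degree C treated S100 fraction:
# 0.1875 umol H2 per mL after 6 minutes with 0.5 mg of protein.
example = specific_activity(0.1875, 2.0, 6.0, 0.5)
print(f"{example:.3f} micromoles H2 per min per mg protein")
```

With these illustrative numbers the result equals the 0.125 threshold recited above; an actual determination would use the measured values for each term.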

A hydrogenase polypeptide described herein typically has additional characteristics, including heat activation. A hydrogenase polypeptide described herein is typically activated by incubation at an elevated temperature. For instance, if a hydrogenase polypeptide is produced at temperatures prevalent when using E. coli to produce the polypeptide, e.g., 37° C., the specific activity can be increased by incubation at a temperature of at least 70° C., or at least 80° C. A hydrogenase polypeptide described herein also has the characteristic of being stable to incubation at high temperature. For instance, a hydrogenase polypeptide described herein does not lose any of its activity after incubation at 90° C. for 10 hours. A hydrogenase polypeptide described herein also has the characteristic of being as sensitive to oxygen as the native form of the enzyme purified from P. furiosus. A hydrogenase polypeptide described herein that has hydrogenase activity catalyzes the proton reduction (H₂ production) coupled to the oxidation of an electron donor, such as NADPH, and also catalyzes the reverse, i.e., the oxidation of H₂ coupled to the reduction of an electron acceptor, such as NADP. Another reaction that may be catalyzed by hydrogenase polypeptides described herein is the reduction of elemental sulfur to hydrogen sulfide with the use of molecular hydrogen (Kim et al. 1999. Biotechnol. Bioeng. 65:108-113; Ma et al., Proc. Nat. Acad. Sci. USA. 90:5341-5344).

A candidate polypeptide having structural similarity to a reference polypeptide may include conservative substitutions of amino acids present in the reference polypeptide. A conservative substitution is typically the substitution of one amino acid for another that is a member of the same class. For example, it is well known in the art of protein biochemistry that an amino acid belonging to a grouping of amino acids having a particular size or characteristic (such as charge, hydrophobicity, and/or hydrophilicity) can generally be substituted for another amino acid without substantially altering the secondary and/or tertiary structure of a polypeptide. For the purposes of this invention, conservative amino acid substitutions are defined to result from exchange of amino acids residues from within one of the following classes of residues: Class I: Gly, Ala, Val, Leu, and Ile (representing aliphatic side chains); Class II: Gly, Ala, Val, Leu, Ile, Ser, and Thr (representing aliphatic and aliphatic hydroxyl side chains); Class III: Tyr, Ser, and Thr (representing hydroxyl side chains); Class IV: Cys and Met (representing sulfur-containing side chains); Class V: Glu, Asp, Asn and Gln (carboxyl or amide group containing side chains); Class VI: His, Arg and Lys (representing basic side chains); Class VII: Gly, Ala, Pro, Trp, Tyr, Ile, Val, Leu, Phe and Met (representing hydrophobic side chains); Class VIII: Phe, Trp, and Tyr (representing aromatic side chains); and Class IX: Asn and Gln (representing amide side chains).
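
The class definitions above lend themselves to a simple lookup. The following is an illustrative sketch, in which the names CONSERVATIVE_CLASSES and is_conservative are hypothetical: a substitution is treated as conservative if at least one of Classes I through IX contains both residues.

```python
# Classes I-IX transcribed from the definition above (three-letter residue codes).
CONSERVATIVE_CLASSES = {
    "I": {"Gly", "Ala", "Val", "Leu", "Ile"},
    "II": {"Gly", "Ala", "Val", "Leu", "Ile", "Ser", "Thr"},
    "III": {"Tyr", "Ser", "Thr"},
    "IV": {"Cys", "Met"},
    "V": {"Glu", "Asp", "Asn", "Gln"},
    "VI": {"His", "Arg", "Lys"},
    "VII": {"Gly", "Ala", "Pro", "Trp", "Tyr", "Ile", "Val", "Leu", "Phe", "Met"},
    "VIII": {"Phe", "Trp", "Tyr"},
    "IX": {"Asn", "Gln"},
}


def is_conservative(original, replacement):
    """Return True if both residues are members of at least one shared class."""
    return any(
        original in members and replacement in members
        for members in CONSERVATIVE_CLASSES.values()
    )


print(is_conservative("Leu", "Ile"))  # both aliphatic (Class I)
print(is_conservative("Asp", "Lys"))  # no class contains both residues
```

Because the classes overlap, a pair such as Gly and Met qualifies through Class VII even though the two residues share no other class.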

There are eight major groups of hydrogenase based on sequence similarities of their catalytic subunits (Vignais and Billoud. 2007. Chem Rev 107:4206-72). Hydrogenase polypeptides described herein are members of group 3b, the bidirectional NAD(P)-linked hydrogenases, and include, for instance, those found in other Pyrococcus and closely related species, e.g., Thermococcus, and also in photosynthetic bacteria (Thiocapsa) and aerobic hydrogen bacteria (Ralstonia). All [NiFe] hydrogenases (from all groups) are characterized by two CxxC domains, termed L1 and L2, that coordinate the Ni and Fe atom at the catalytic site of the catalytic subunit, alpha, an example of which is shown at SEQ ID NO:8. Each of the groups has conserved sequences surrounding these sites. The consensus L1 site is R[IV]C[AGS][FIL]Cxxx[HY]xx[AST][ANS]xx[AS][AILV] (SEQ ID NO:46), where x is any amino acid, and where one amino acid is chosen from each set enclosed by brackets (e.g., the second amino acid of the consensus is I or V). Examples of L1 sites include, but are not limited to, RICSFCSAAHKLTALEAA (SEQ ID NO:47), and RVCGICSAAHKLTALEAA (SEQ ID NO:48). The consensus L2 site is R[ANS][FHY]DPCISC[AS][ATV]H (SEQ ID NO:49), where one amino acid is chosen from each set enclosed by brackets (e.g., the second amino acid of the consensus is A or N or S). In both L1 and L2 sites, the change of any of the four cysteines is expected to result in a decrease or complete loss of hydrogenase activity. Further, regions of conservation can be determined by comparison of the amino acid sequences of each subunit (SEQ ID NO:2, 4, 6, or 8) with other hydrogenase subunits from other organisms (see FIG. 21). Thus, the skilled person can easily determine which amino acid residues can be altered without any effect on hydrogenase activity, and which cannot be changed or can be altered only through use of conservative substitutions.
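
The consensus L1 and L2 sites can be expressed as regular expressions for scanning a candidate alpha subunit sequence. The following is a sketch under stated assumptions: the function name find_motifs and the test sequences are hypothetical, "x" in the consensus is written as "." in regular expression syntax, and the L2 instance RAFDPCISCAAH is constructed from the consensus for illustration rather than taken from a disclosed sequence.

```python
import re

# Consensus sites transcribed from the text; "x" (any amino acid) becomes ".".
L1_CONSENSUS = re.compile(r"R[IV]C[AGS][FIL]C...[HY]..[AST][ANS]..[AS][AILV]")
L2_CONSENSUS = re.compile(r"R[ANS][FHY]DPCISC[AS][ATV]H")

# The two L1 examples recited in the text (SEQ ID NO:47 and SEQ ID NO:48).
L1_EXAMPLES = ["RICSFCSAAHKLTALEAA", "RVCGICSAAHKLTALEAA"]
# Hypothetical L2 instance built from the consensus, for illustration only.
HYPOTHETICAL_L2 = "RAFDPCISCAAH"


def find_motifs(sequence):
    """Return the (start index, matched text) pairs of L1 and L2 sites in a sequence."""
    seq = sequence.upper()
    return {
        "L1": [(m.start(), m.group()) for m in L1_CONSENSUS.finditer(seq)],
        "L2": [(m.start(), m.group()) for m in L2_CONSENSUS.finditer(seq)],
    }


for example in L1_EXAMPLES:
    print(example, bool(L1_CONSENSUS.fullmatch(example)))

# Replacing the first cysteine of an L1 site breaks the match.
print(bool(L1_CONSENSUS.fullmatch("RISSFCSAAHKLTALEAA")))
```

A match to both patterns indicates only that the conserved cysteine-containing sites are present; it does not by itself establish hydrogenase activity, which is determined as described herein.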

Guidance concerning how to make phenotypically silent amino acid substitutions is provided in Bowie et al. (1990. Science, 247:1306-1310), wherein the authors indicate proteins are surprisingly tolerant of amino acid substitutions. For example, Bowie et al. disclose that there are two main approaches for studying the tolerance of a polypeptide sequence to change. The first method relies on the process of evolution, in which mutations are either accepted or rejected by natural selection. The second approach uses genetic engineering to introduce amino acid changes at specific positions of a cloned gene and selects or screens to identify sequences that maintain functionality. As stated by the authors, these studies have revealed that proteins are surprisingly tolerant of amino acid substitutions. The authors further indicate which changes are likely to be permissive at a certain position of the protein. For example, most buried amino acid residues require non-polar side chains, whereas few features of surface side chains are generally conserved. Other such phenotypically silent substitutions are described in Bowie et al, and the references cited therein.

A candidate polypeptide having structural similarity to one of the polypeptides SEQ ID NO:2, 4, 6, or 8 has hydrogenase activity when expressed in a microbe with the other 3 reference structural polypeptides and the 8 reference accessory polypeptides (SEQ ID NOs:10, 12, 14, 16, 18, 20, 22, and 24, described in detail below). For instance, when determining if a candidate polypeptide having some level of identity to SEQ ID NO:2 has hydrogenase activity, the candidate polypeptide is expressed in a microbe with reference polypeptides SEQ ID NO: 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, and 24. Likewise, when determining if a candidate polypeptide having some level of identity to SEQ ID NO:4 has hydrogenase activity, the candidate polypeptide is expressed in a microbe with reference polypeptides SEQ ID NO: 2, 6, 8, 10, 12, 14, 16, 18, 20, 22, and 24, and so on for determining hydrogenase activity of candidate polypeptides having identity to each of the other structural or accessory polypeptides.
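
The co-expression rule above can be stated compactly: the candidate replaces exactly one reference polypeptide and is expressed with the remaining eleven. The following sketch lists those partners; the function name coexpression_partners is hypothetical and the integers are simply the SEQ ID NOs recited in the text.

```python
STRUCTURAL_SEQ_IDS = [2, 4, 6, 8]                       # hydrogenase subunits
ACCESSORY_SEQ_IDS = [10, 12, 14, 16, 18, 20, 22, 24]    # accessory polypeptides


def coexpression_partners(candidate_reference):
    """Return the reference SEQ ID NOs co-expressed with a candidate.

    candidate_reference is the SEQ ID NO to which the candidate has identity;
    every other structural and accessory reference polypeptide is a partner.
    """
    all_ids = STRUCTURAL_SEQ_IDS + ACCESSORY_SEQ_IDS
    if candidate_reference not in all_ids:
        raise ValueError(f"SEQ ID NO:{candidate_reference} is not a reference polypeptide")
    return [seq_id for seq_id in all_ids if seq_id != candidate_reference]


print(coexpression_partners(2))
print(coexpression_partners(4))
```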

P. furiosus contains a second hydrogenase (SH-II) that is highly similar to the hydrogenase polypeptides described herein. SH-II was purified from native biomass of P. furiosus (Ma et al., 2000. J Bacteriol. 182(7):1864-71). It has very similar catalytic properties, and virtually identical physical properties to those of the hydrogenase polypeptides described herein. It contains four subunits of very similar size to those of the hydrogenase polypeptides described herein and these are predicted to coordinate exactly the same cofactors as the subunits of the hydrogenase polypeptides described herein. However, the sequences show only 55-63% sequence similarity. Nevertheless, P. furiosus has only one set of accessory genes to process and mature a hydrogenase, and so it is predicted that the set of accessory coding regions described herein that are used by P. furiosus to process the hydrogenase polypeptides described herein must also be used by the organism to process SH-II. Despite the apparent lack of sequence similarity the SH-I alpha and SH-II alpha subunits share a high degree of identity in the conserved L2 region and the C-terminal sequence that is cleaved for hydrogenase activity. Therefore, it is expected that the E. coli expression system described herein, which includes the accessory genes of P. furiosus, would also process and produce an active form of SH-II. In this case the plasmid containing the four SH-I genes would be replaced in E. coli by one containing the four SH-II genes.

Also provided are isolated polynucleotides encoding the polypeptides described herein. For instance, a polynucleotide may have a nucleotide sequence encoding a polypeptide having the amino acid sequence shown in SEQ ID NOs:2, 4, 6, or 8, and an example of the class of nucleotide sequences encoding each polypeptide is SEQ ID NOs:1, 3, 5, 7, respectively. It should be understood that a polynucleotide encoding a polypeptide represented by one of the sequences disclosed herein, e.g., SEQ ID NOs:2, 4, 6, or 8, is not limited to the nucleotide sequences disclosed herein, e.g., SEQ ID NOs:1, 3, 5, or 7, respectively, but also includes the class of polynucleotides encoding such polypeptides as a result of the degeneracy of the genetic code. For example, the naturally occurring nucleotide sequence SEQ ID NO:1 is but one member of the class of nucleotide sequences encoding a polypeptide having the amino acid sequence SEQ ID NO:2. Likewise, the naturally occurring nucleotide sequences SEQ ID NO:3, 5, or 7, are but single members of the class of nucleotide sequences encoding a polypeptide having the amino acid sequence SEQ ID NO:4, 6, or 8, respectively. The class of nucleotide sequences encoding a selected polypeptide sequence is large but finite, and the nucleotide sequence of each member of the class may be readily determined by one skilled in the art by reference to the standard genetic code, wherein different nucleotide triplets (codons) are known to encode the same amino acid.
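
The size of such a class follows directly from the standard genetic code: it is the product, over all residues, of the number of codons that encode each residue. The following is a minimal sketch of that count, in which the function name count_encoding_sequences is hypothetical, the codon counts are those of the standard genetic code, and the stop codon is ignored.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encoding_sequences(peptide):
    """Return how many distinct nucleotide sequences encode the given peptide."""
    return prod(CODONS_PER_AMINO_ACID[residue] for residue in peptide.upper())


print(count_encoding_sequences("MW"))   # Met and Trp each have a single codon
print(count_encoding_sequences("RS"))   # Arg and Ser each have six codons
print(count_encoding_sequences("RICSFCSAAHKLTALEAA"))  # an 18-residue L1 site
```

Even for a short peptide the count grows multiplicatively, which is why the class for a full-length subunit is large yet still finite and enumerable in principle.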

A polynucleotide disclosed herein may have structural similarity with the nucleotide sequence of SEQ ID NO:1, 3, 5, or 7. Such a polynucleotide may be isolated from a microbe, such as thermophiles (prokaryotic microbes that grow in environments at temperatures of between 60° C. and 79° C.), and hyperthermophiles (prokaryotic microbes that grow in environments at temperatures above 80° C.). Examples include archaea such as, but not limited to, a member of the genera Pyrococcus, for instance P. furiosus, P. abyssi, or P. horikoshii, or a member of the genera Thermococcus, for instance, T. kodakaraensis or T. onnurineus, or may be produced using recombinant techniques, or chemically or enzymatically synthesized. A polynucleotide disclosed herein may further include heterologous nucleotides flanking the open reading frame. Typically, heterologous nucleotides may be at the 5′ end of the coding region, at the 3′ end of the coding region, or the combination thereof. The number of heterologous nucleotides may be, for instance, at least 10, at least 100, or at least 1000.

An aspect of the present invention also includes fragments of the polypeptides described herein, and the polynucleotides encoding such fragments, such as SEQ ID NOs:2, 4, 6, and 8, as well as those polypeptides having structural similarity to SEQ ID NOs: 2, 4, 6, and 8. A polypeptide fragment may include a sequence of at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, at least 75, at least 80, at least 85, at least 90, at least 95, or at least 100 amino acid residues.

A polypeptide described herein or a fragment thereof may be expressed as a fusion polypeptide that includes a polypeptide of the present invention or a fragment thereof and a heterologous amino acid sequence. The heterologous amino acid sequence may be present at the amino terminal end or the carboxy terminal end of a polypeptide, or it may be present within the amino acid sequence of the polypeptide. For instance, the heterologous amino acid sequence may be useful for purification of the fusion polypeptide by affinity chromatography. Various methods are available for the addition of such affinity purification tags to proteins. Examples of tags include a polyhistidine-tag, maltose-binding protein, and Strep-tag®. Representative examples may be found in Hopp et al. (U.S. Pat. No. 4,703,004), Hopp et al. (U.S. Pat. No. 4,782,137), Sgarlato (U.S. Pat. No. 5,935,824), Sharma (U.S. Pat. No. 5,594,115), and Skerra and Schmidt (1999, Biomol Eng. 16:79-86). In another example, the heterologous amino acid sequence may be a carrier polypeptide. The carrier polypeptide may be used to increase the immunogenicity of the fusion polypeptide to increase production of antibodies that specifically bind to a polypeptide of the invention. The invention is not limited by the types of carrier polypeptides that may be used to create fusion polypeptides. Examples of carrier polypeptides include, but are not limited to, keyhole limpet hemocyanin, bovine serum albumin, ovalbumin, mouse serum albumin, rabbit serum albumin, and the like. The heterologous amino acid sequence, for instance, a tag or a carrier, may also include a cleavable site that permits removal of most or all of the additional amino acid sequence. Examples of cleavable sites are known to the skilled person and routinely used, and include, but are not limited to, a TEV protease recognition site. The number of heterologous amino acids may be, for instance, at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, or at least 40.

A polypeptide described herein may be modified. An example of a modification is a chemical modification with a hydrophobic group. Examples of suitable hydrophobic groups include, but are not limited to, polyethylene glycol derivatives, such as polyoxyethylene glycol p-nitrophenyl carbonate (PEG-pNPC), methoxypolyethylene glycol p-nitrophenyl carbonate (MPEG-pNPC), and methoxypolyethylene glycol cyanuric chloride (MPEG-CC). Preferably, the molecular weight of a polyethylene glycol derivative is less than 5 kDa. Methods for chemically modifying polypeptides are routine and known in the art. Such modified polypeptides can have altered characteristics such as increased solubility in organic solvents while retaining enzymatic activity. An example of modification of a polypeptide described herein is taught by Kim et al. (1999. Biotechnol. Bioeng. 65:108-113), where an SH-I hydrogenase polypeptide obtained from P. furiosus was modified with MPEG-CC. The resulting polypeptide retained the ability to reduce elemental sulfur to hydrogen sulfide (Ma et al., Proc. Nat. Acad. Sci. USA. 90:5341-5344).

A polynucleotide disclosed herein can be present in a vector. A vector is a replicating polynucleotide, such as a plasmid, phage, or cosmid, to which another polynucleotide may be attached so as to bring about the replication of the attached polynucleotide. Construction of vectors containing a polynucleotide of the invention may employ standard ligation techniques known in the art. See, e.g., (Sambrook et al., 1989. Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). A vector can provide for further cloning (amplification of the polynucleotide), i.e., a cloning vector, or for expression of the polynucleotide, i.e., an expression vector. The term vector includes, but is not limited to, plasmid vectors, viral vectors, cosmid vectors, and artificial chromosome vectors. Preferably the vector is a plasmid.

Selection of a vector depends upon a variety of desired characteristics in the resulting construct, such as a selection marker, vector replication rate, and the like. Vectors can be introduced into a host cell using methods that are known and used routinely by the skilled person. The vector may replicate separately from the chromosome present in the microbe, or the polynucleotide may be integrated into a chromosome of the microbe.

An expression vector may optionally include a promoter that results in expression of an operably linked coding region during growth in anaerobic conditions. Promoters act as regulatory signals that bind RNA polymerase in a cell to initiate transcription of a downstream (3′ direction) coding region. The promoter used may be a constitutive or an inducible promoter. It may be, but need not be, heterologous with respect to a host cell. Examples of suitable promoters include, but are not limited to, P-hya (SEQ ID NO:25), P-hyc (SEQ ID NO:26), and P-xyl (SEQ ID NO:27). The hydrogenase promoters P-hya and P-hyc can be obtained from E. coli, and are expressed (at different strengths) under anaerobic growth conditions and at undetectable levels under aerobic growth conditions. The xylose responsive promoter P-xyl is a slightly modified version of the B. megaterium xylose promoter (Qazi et al. 2001. Microb Ecol 41:301-309) denoted PxylA (Rygus et al. 1991. Arch Microbiol 155:535-42) (P-xyl, SEQ ID NO:27). This xylose promoter was discovered to be useful for expressing genes in E. coli under either aerobic or anaerobic conditions. This is a promoter sequence derived from an aerobic, gram positive organism (rather than from E. coli, which is a facultatively anaerobic gram negative organism), and it was not expected that this would function in E. coli. Fortuitously, we discovered that in E. coli it expresses at very high levels under both aerobic and anaerobic conditions.

It should be understood that a promoter that drives expression of an operably linked coding region during growth in anaerobic conditions is not limited to the nucleotide sequences disclosed at SEQ ID NOs:25, 26, or 27. A person of ordinary skill will understand that the promoters disclosed herein may be modified by substitution (such as transition or transversion), deletion, and/or insertion of one or more nucleotides, where the altered promoter maintains its ability to drive expression of an operably linked coding region during growth in anaerobic conditions. Such modified promoters can be easily constructed using routine methods known in the art such as classical mutagenesis, site-directed mutagenesis, and DNA shuffling. Other useful promoters can be obtained from the genomes of microbes by reference to the regions upstream of coding sequences that are expressed under anaerobic conditions, such as coding regions encoding hydrogenase enzymes or involved in anaerobic respiration.

A vector introduced into a host cell optionally includes one or more marker sequences, which typically encode a molecule that inactivates or otherwise detects or is detected by a compound in the growth medium. For example, the inclusion of a marker sequence may render the transformed cell resistant to an antibiotic, or it may confer compound-specific metabolism on the transformed cell. Examples of a marker sequence include, but are not limited to, sequences that confer resistance to kanamycin, ampicillin, chloramphenicol, tetracycline, streptomycin, and neomycin.

Provided herein is a series of expression vectors which express recombinant proteins under strictly anaerobic growth conditions in a microbe, preferably E. coli. No E. coli protein expression vectors currently used are capable of this. In fact, most E. coli expression systems use a modified bacteriophage T7 promoter, regulated by a modification of the E. coli lactose operon repressor, so that expression of target genes can be induced by addition of lactose or the lactose homolog isopropyl-β-D-thiogalactopyranoside (IPTG) (Studier, F. W. 2005. Protein Expr Purif 41:207-34; Terpe, 2006. Appl Microbiol Biotechnol 72:211-22). However, this system does not operate under strictly anaerobic conditions, and herein we utilized promoters that E. coli uses when grown in the absence of air. The expression vectors include a P-hya, P-hyc, or P-xyl promoter. An expression vector may include other polynucleotides that aid in, for instance, the cloning, manipulation, or expression of an operably linked coding region, or the purification of a polypeptide encoded by the coding region.

Polypeptides and fragments thereof described herein may be produced using recombinant DNA techniques, such as an expression vector present in a cell. Such methods are routine and known in the art. The polypeptides and fragments thereof may also be synthesized in vitro, e.g., by solid phase peptide synthetic methods. Solid phase peptide synthetic methods are routine and known in the art. A polypeptide produced using recombinant techniques or by solid phase peptide synthetic methods may be further purified by routine methods, such as fractionation on immunoaffinity or ion-exchange columns, ethanol precipitation, reverse phase HPLC, chromatography on silica or on an anion-exchange resin such as DEAE, chromatofocusing, SDS-PAGE, ammonium sulfate precipitation, gel filtration using, for example, Sephadex G-75, or ligand affinity. A preferred method for isolating and optionally purifying a hydrogenase polypeptide described herein includes column chromatography using, for instance, ion exchange chromatography, such as DEAE sepharose, hydrophobic interaction chromatography, such as phenyl sepharose, or the combination thereof.

Polynucleotides of the present invention may be obtained from microbes, or produced in vitro or in vivo. For instance, methods for in vitro synthesis include, but are not limited to, chemical synthesis with a conventional DNA/RNA synthesizer. Commercial suppliers of synthetic polynucleotides and reagents for such synthesis are well known.

Also disclosed herein are genetically modified microbes that have exogenous polynucleotides encoding one or more of the polypeptides disclosed herein. Compared to a control microbe that is not genetically modified, a genetically modified microbe may exhibit production of a hydrogenase polypeptide, such as a tetrameric or a dimeric hydrogenase polypeptide. Accordingly, in one aspect of the invention a genetically modified microbe may include one or more exogenous polynucleotides that encode the subunits of a hydrogenase polypeptide. Exogenous polynucleotides encoding a hydrogenase polypeptide may be present in the microbe as a vector or integrated into a chromosome.

Examples of useful bacterial host cells include, but are not limited to, Escherichia (such as Escherichia coli), Salmonella (such as Salmonella enterica, Salmonella typhi, Salmonella typhimurium), a Thermotoga spp. (such as T. maritima), an Aquifex spp. (such as A. aeolicus), photosynthetic organisms including cyanobacteria (such as a Synechococcus spp. such as Synechococcus sp. WH8102 or Synechocystis spp. such as Synechocystis PCC 6803) and photosynthetic bacteria (such as a Rhodobacter spp. such as Rhodobacter sphaeroides) and the like. Examples of useful archaeal host cells include, but are not limited to, a Pyrococcus spp., such as P. furiosus, P. abyssi, and P. horikoshii, a Sulfolobus spp., such as S. solfataricus, a Thermococcus spp., such as T. kodakaraensis, and the like.

A genetically modified microbe having exogenous polynucleotides encoding one or more of the polypeptides disclosed herein may optionally include accessory polypeptides. These accessory polypeptides act to assemble the hydrogenase polypeptides described herein. Without intending to be limiting, it is believed the accessory polypeptides play a role in constructing the non-protein ligands present in the hydrogenase polypeptides. The accessory polypeptides include a first accessory polypeptide having the amino acid sequence SEQ ID NO:10 or an amino acid sequence having structural similarity thereto, a second accessory polypeptide having the amino acid sequence SEQ ID NO:12 or an amino acid sequence having structural similarity thereto, a third accessory polypeptide having the amino acid sequence SEQ ID NO:14 or an amino acid sequence having structural similarity thereto, a fourth accessory polypeptide having the amino acid sequence SEQ ID NO:16 or an amino acid sequence having structural similarity thereto, a fifth accessory polypeptide having the amino acid sequence SEQ ID NO:18 or an amino acid sequence having structural similarity thereto, a sixth accessory polypeptide having the amino acid sequence SEQ ID NO:20 or an amino acid sequence having structural similarity thereto, a seventh accessory polypeptide having the amino acid sequence SEQ ID NO:22 or an amino acid sequence having structural similarity thereto, and an eighth accessory polypeptide having the amino acid sequence SEQ ID NO:24 or an amino acid sequence having structural similarity thereto. Preferably, an exogenous polynucleotide encoding an accessory polypeptide is operably linked to a promoter that drives expression of the polynucleotide during growth in anaerobic conditions.

Also provided herein are isolated polypeptides having the amino acid sequence SEQ ID NOs:10, 12, 14, 16, 18, 20, 22, and 24, and amino acid sequences having structural similarity thereto, and isolated polynucleotides encoding the polypeptides.

A candidate polypeptide having structural similarity to one of the accessory polypeptides (SEQ ID NOs: 10, 12, 14, 16, 18, 20, 22, or 24) has activity when expressed in a microbe with the 4 reference polypeptides encoding a tetrameric hydrogenase polypeptide and the other 7 reference accessory polypeptides. For instance, when determining if a candidate polypeptide having some level of identity to SEQ ID NO:10 has the activity of catalyzing the biosynthesis of an active hydrogenase polypeptide, the candidate polypeptide is expressed in a microbe with reference polypeptides SEQ ID NO: 2, 4, 6, 8, 12, 14, 16, 18, 20, 22, and 24. Likewise, when determining if a candidate polypeptide having some level of identity to SEQ ID NO:12 has the activity of catalyzing the biosynthesis of an active hydrogenase polypeptide, the candidate polypeptide is expressed in a microbe with reference polypeptides SEQ ID NO: 2, 4, 6, 8, 10, 14, 16, 18, 20, 22, and 24, and so on.
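The co-expression scheme described above (each candidate is tested alongside the four structural reference polypeptides and the seven remaining accessory reference polypeptides) can be sketched as a small helper. This is only an illustrative sketch: the SEQ ID NOs come from the text, while the function and variable names are hypothetical and not part of the disclosure.

```python
# SEQ ID NOs taken from the text: four hydrogenase subunits and
# eight accessory polypeptides. All names below are illustrative.
STRUCTURAL_SEQ_IDS = [2, 4, 6, 8]
ACCESSORY_SEQ_IDS = [10, 12, 14, 16, 18, 20, 22, 24]


def coexpression_panel(candidate_reference_id):
    """Return the reference SEQ ID NOs to co-express with a candidate.

    candidate_reference_id is the accessory SEQ ID NO that the candidate
    resembles; that reference is left out, so any hydrogenase activity
    observed depends on the candidate supplying the missing function.
    """
    if candidate_reference_id not in ACCESSORY_SEQ_IDS:
        raise ValueError("candidate must correspond to an accessory SEQ ID NO")
    others = [s for s in ACCESSORY_SEQ_IDS if s != candidate_reference_id]
    return STRUCTURAL_SEQ_IDS + others


# List the panel for each of the eight accessory polypeptides in turn.
for seq_id in ACCESSORY_SEQ_IDS:
    print(seq_id, coexpression_panel(seq_id))
```

Each panel contains eleven reference polypeptides, matching the two worked cases given in the text for SEQ ID NO:10 and SEQ ID NO:12.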

In another aspect a genetically modified microbe may express an endogenous hydrogenase polypeptide at an increased level or having altered activity. For instance, a genetically modified microbe may include an altered regulatory sequence, where the altered regulatory sequence is operably linked to one or more coding regions encoding subunits of a hydrogenase polypeptide. In another example, an endogenous polynucleotide encoding a subunit of a hydrogenase polypeptide may include a mutation, such as a deletion, an insertion, a transition, a transversion, or a combination thereof, that alters a characteristic of the hydrogenase polypeptide, such as the activity. In those aspects where a genetically modified microbe expresses an endogenous hydrogenase polypeptide at an increased level or having altered activity, the microbe is typically an archaeon, such as a Pyrococcus spp., such as P. furiosus, P. abyssi, and P. horikoshii, a Thermococcus spp., such as T. kodakaraensis and T. onnurineus, and the like. Methods for modifying genomic DNA sequences of thermophiles and hyperthermophiles are known (Yang et al., PCT Application No. PCT/US2008/081157, filed Oct. 24, 2008, and Westpheling et al., U.S. Provisional Patent Application 61/000,338, filed Oct. 25, 2007).

A genetically modified microbe may include other modifications in addition to exogenous polynucleotides encoding one or more of the polypeptides disclosed herein, or expressing an endogenous hydrogenase polypeptide at an increased level or having altered activity. Such modifications may provide for increased production of electron donors used by a hydrogenase polypeptide described herein, such as NADPH. For instance, modifications may provide for increased levels in a cell of the enzymes used in the oxidative phase of the pentose phosphate pathway, such as glucose 6-phosphate dehydrogenase, 6-phosphogluconolactonase, and 6-phosphogluconate dehydrogenase. Modifications may provide for increased levels of substrates used in the oxidative phase of the pentose phosphate pathway by, for instance, increasing production of enzymes in biosynthetic pathways, reducing feedback inhibition at different locations in biosynthetic pathways, increasing importation of substrates and/or compounds used in biosynthetic pathways to make substrates, decreasing catabolism of substrates and/or compounds used in biosynthetic pathways to make substrates. Methods for modifying microbes to increase these and other compounds are routine and known in the art.

A genetically modified microbe of the present invention may include other modifications that provide for increased ability to use renewable resources, such as, but not limited to, biomass containing polysaccharides that can be broken down to yield glucose 6-phosphate, the first reactant of the pentose phosphate pathway and the substrate of the enzyme glucose 6-phosphate dehydrogenase. An example of such a polysaccharide is starch. Such modifications may provide for increased production of enzymes useful in the breakdown of biomass.

The hydrogenase polypeptides described herein can be used to produce molecular hydrogen. Molecular hydrogen is used in the petroleum and chemical industries. For instance, in a petrochemical plant, hydrogen is used for hydrodealkylation, hydrodesulfurization, and hydrocracking, all methods of refining crude oil for wider use. Molecular hydrogen is used for the production of ammonia, methanol, hydrochloric acid, and as a reducing agent for metal ores. In the food industry molecular hydrogen is used for hydrogenation of vegetable oils and fats, for instance, in producing margarine from liquid vegetable oil. Hydrogen is also useful as a fuel, both in traditional combustion engines as well as in fuel cells, and produces only water vapor when oxidized with oxygen.

In addition to hydrogen production systems, the applications for hydrogenase polypeptides described herein include cofactor [beta-1,4-nicotinamide adenine dinucleotide, reduced form (NADH) or beta-1,4-nicotinamide adenine dinucleotide phosphate, reduced form (NADPH)] regeneration (from NAD or NADP, respectively) using hydrogen as the source of energy (Hummel, 1999. Trends Biotechnol. 17:487-492; Mertens et al., 2003. J. Mol. Catal. B: Enzym. 24-25:39-52). The hydrogenase polypeptides described herein have significant advantages over other enzymatic methods to regenerate these reduced cofactors as there is no oxidation product to remove or dispose of other than protons (from hydrogen oxidation). This is in contrast to, for example, lactate dehydrogenase, where lactate is the source of energy and the product is the C3 compound pyruvate (Eberly and Ely, 2008. Crit. Rev. Microbiol. 34:117-130). Cofactor regeneration using hydrogen with no waste products would be of tremendous benefit for the pharmaceutical industry.

Hydrogenase polypeptides obtained from P. furiosus have also been chemically modified such that the enzyme is soluble and active in water-immiscible organic solvents such as toluene (Kim et al. 1999. Biotechnol. Bioeng. 65:108-113). Hydrogenase polypeptides described herein can also be chemically modified. Thus, the polypeptides described herein can reduce water-insoluble compounds with hydrogen. For example, elemental sulfur can be reduced to H₂S, which is useful in removal of sulfur from some compositions used in the petroleum and coal industries.

Accordingly, provided herein are methods for making and using the hydrogenase polypeptides of the present invention. Methods for making a polypeptide having hydrogenase activity can include providing a genetically modified microbe that includes exogenous polynucleotides encoding 1, 2, 3, or 4 subunits of a hydrogenase polypeptide described herein, preferably 2 or 4 subunits, and incubating the microbe under conditions suitable for expression of the exogenous polynucleotides to produce a polypeptide, wherein the polypeptide has hydrogenase activity. The genetically modified microbe can be a bacterial cell, such as a gram-negative bacterium, for instance, E. coli, or it can be an archaeal cell, for instance, a member of the genus Pyrococcus, for instance, P. furiosus, P. abyssi, or P. horikoshii, or a member of the genus Thermococcus, for instance, T. kodakaraensis or T. onnurineus, or a photosynthetic bacterium, for instance, Rhodobacter sphaeroides. The genetically modified microbe may include exogenous polynucleotides encoding the accessory polypeptides described herein. In those aspects where the genetically modified microbe is a bacterial cell, such as E. coli, the genetically modified microbe typically includes exogenous polynucleotides encoding the accessory polypeptides. The incubation conditions are typically anaerobic, and the temperature may be at least 37° C., at least 60° C., at least 70° C., at least 80° C., or at least 90° C. The methods can be performed in any convenient manner. For instance, methods for growing microbial cells to high densities are routine and known in the art, and include batch and continuous fermentation processes. The method may further include isolating, and optionally purifying, the hydrogenase polypeptide. Methods for isolating and optionally purifying hydrogenase polypeptides described herein are routine and known in the art.

Also provided herein are methods for using a hydrogenase polypeptide described herein. The methods can include providing a hydrogenase polypeptide, and incubating the hydrogenase polypeptide under conditions suitable for producing desirable products such as H₂ or NADPH. Optionally, the product is collected using methods routine and known in the art.

In one aspect, the hydrogenase polypeptide used in the methods is cell-free, for instance, it is isolated, or optionally purified. Conditions suitable for incubating an isolated hydrogenase polypeptide may generally include aqueous conditions containing a suitable buffer, such as, but not limited to, EPPS (4-(2-hydroxyethyl)piperazine-1-propanesulfonic acid) at a concentration of 50 mM and buffered near neutral pH (typically 7.5-8.5). The hydrogenase polypeptide may be incubated in an organic solvent, such as, but not limited to, toluene, xylene, benzene, methylene chloride, chloroform, or tetrahydrofuran. A hydrogenase polypeptide that is incubated in an organic solvent is typically chemically modified, preferably with a hydrophobic group, as described herein. The incubation conditions are typically anaerobic, and the temperature may be at least 60° C., at least 70° C., at least 80° C., or at least 90° C. The methods can be performed in any convenient manner. Thus, the reaction steps may be performed in a single reaction vessel. The process may be performed as a batch process or as a continuous process, with desired product and waste products being removed continuously and new raw materials being introduced.

Methods for using an isolated hydrogenase polypeptide include the use of such a polypeptide bound to a surface. In some aspects the surface can be one that conducts electricity, such as an anode. Hydrogenase polypeptides bound to surfaces are useful for applications such as, but not limited to, fuel cells (Armstrong, U.S. Published Patent Application 20040214053).

Methods for using an isolated hydrogenase polypeptide include production of desirable products, such as molecular hydrogen, using renewable resources. For instance, biomass derived polysaccharides can be used as a substrate for the production of monomeric carbohydrates that could then be used as a source of NADPH, which in turn can be used by a hydrogenase polypeptide disclosed herein to produce hydrogen. Examples of such methods include in vitro hydrogen production as taught by Woodward et al. (1996. Nat Biotechnol 14:872-4), and Zhang et al. (2007. PLoS ONE 2:e456, and U.S. Published Patent Application 20070264534). Examples of useful polysaccharides include, but are not limited to, starch and cellulose. Renewable sources of these polysaccharides are known in the art.

In another aspect, a hydrogenase polypeptide used in the methods is present in a microbial cell. The methods can include incubating the microbial cell under conditions suitable for the expression of the polypeptide. The microbial cell is typically a genetically modified microbe, and may be a bacterial cell, such as a gram-negative bacterium, for instance, E. coli, a photosynthetic organism, for instance, R. sphaeroides, or it can be an archaeal cell, for instance, a member of the genus Pyrococcus, for instance, P. furiosus, P. abyssi, or P. horikoshii, or a member of the genus Thermococcus, for instance, T. kodakaraensis or T. onnurineus. The microbe may include exogenous polynucleotides encoding the accessory polypeptides described herein. In those aspects where the microbe is a bacterial cell, such as E. coli, the microbe typically includes exogenous polynucleotides encoding the accessory polypeptides. The incubation conditions are typically anaerobic, and the temperature may be at least 37° C., at least 60° C., at least 70° C., at least 80° C., or at least 90° C. The conditions used to incubate the microbial cell typically include substrates that can be used by a cell to produce a reductant, such as NADPH, or the reductant can be photoproduced by a photosynthetic cell, and the NADPH can be used by the hydrogenase polypeptide to produce molecular hydrogen. Examples of useful substrates include renewable resources containing polysaccharides such as starch, cellulose, or a combination thereof. Alternatively, the conditions used to incubate the microbial cell can include H₂, which can be used by the hydrogenase polypeptide to convert NADP to NADPH. The methods can be performed in any convenient manner. For instance, methods for growing microbial cells to high densities are routine and known in the art, and include batch and continuous fermentation processes.

The present invention is illustrated by the following examples. It is to be understood that the particular examples, materials, amounts, and procedures are to be interpreted broadly in accordance with the scope and spirit of the invention as set forth herein.

Example 1 Anaerobic Expression Vectors

A series of compatible vectors has been constructed with the various promoters described above. The expression vectors described here are derivatives of those described in Horanyi et al., (U.S. Published Patent Application 20060183193). These are a series of four vectors with compatible origins of replication and different antibiotic resistance markers which allow coexpression of multiple genes in E. coli using the lac operon regulation. These vectors have been modified to include the “anaerobic” promoters described above (Table 2) and up to 12 genes derived from P. furiosus. These are a) the structural genes for the four subunits of P. furiosus hydrogenase (Table 1) and b) the eight genes that encode the hydrogenase processing genes in P. furiosus (Table 1). The complete list of vectors created is found in Table 3, and four particular examples are shown in FIGS. 1-4. The complete map and sequences of these four vectors are shown in FIG. 8.

TABLE 1 Pyrococcus furiosus genes encoding structural and accessory proteins for cytoplasmic hydrogenase I and Genbank accession numbers.

SEQ ID NO  PF gene identifier  Gene                                          Genbank Accession#  Coding region or deduced polypeptide sequence
1          PF0891              Structural gene, hydrogenase I beta subunit   AE010204.1          coding region
2          PF0891              Structural gene, hydrogenase I beta subunit   AAL81015            Polypeptide encoded by coding region
3          PF0892              Structural gene, hydrogenase I gamma subunit  AE010204.1          coding region
4          PF0892              Structural gene, hydrogenase I gamma subunit  AAL81016            Polypeptide encoded by coding region
5          PF0893              Structural gene, hydrogenase I delta subunit  AE010204.1          coding region
6          PF0893              Structural gene, hydrogenase I delta subunit  AAL81017            Polypeptide encoded by coding region
7          PF0894              Structural gene, hydrogenase I alpha subunit  AE010204.1          coding region
8          PF0894              Structural gene, hydrogenase I alpha subunit  AAL81018            Polypeptide encoded by coding region
9          PF0548              HypC                                          AE010177.1          coding region
10         PF0548              HypC                                          AAL80672            Polypeptide encoded by coding region
11         PF0549              HypD                                          AE010177.1          coding region
12         PF0549              HypD                                          AAL80673            Polypeptide encoded by coding region
13         PF0559              HypF                                          AE010178.1          coding region
14         PF0559              HypF                                          AAL80683            Polypeptide encoded by coding region
15         PF0604              HypE                                          AE010182.1          coding region
16         PF0604              HypE                                          AAL80728            Polypeptide encoded by coding region
17         PF0615              HypA                                          AE010183.1          coding region
18         PF0615              HypA                                          AAL80739            Polypeptide encoded by coding region
19         PF0616              HypB                                          AE010183.1          coding region
20         PF0616              HypB                                          AAL80740            Polypeptide encoded by coding region
21         PF0617              HycI                                          AE010183.1          coding region
22         PF0617              HycI                                          AAL80741            Polypeptide encoded by coding region
23         PF1401              SlyD                                          AE010243.1          coding region
24         PF1401              SlyD                                          AAL81525            Polypeptide encoded by coding region

TABLE 2 Escherichia coli hydrogenase promoter DNA sequences derived from the K12 strain genome (accession number NC_000913), and Bacillus megaterium xylose promoter DNA sequences (derived from accession number X57598) (Qaziet al. 2001. Microb Ecol 41:301-309). Genome SEQ ID Gene Genbank  nucleotide DNA NO identifier Accession# start and stop Sequence 25 E. coli K12 hya NC_000913.2 1031062-1031364 CTCGAATTCCTTCTCTTTTACTCGTTTAGCAAC promoter CGGCTAAACATCCCCACCGCCCGGCCAAAAGAA AAATAGGTCCATTTTTATCGCTAAAAGATAAAT CCACACAGTTTGTATTGTTTTGTGCAAAAGTTT CACTACGCTTTATTAACAATACTTTCTGGCGAC GTGCGCCAGTGCAGAAGGATGAGCTTTCGTTTT CAGCATCTCACGTGAAGCGATGGTTTGCCTTGC TACAGGGACGTCGCTTGCCGACCATAAGCGCCC GGTGTCCTGCCGGTGTCGCAAGGAGGAGAGACG TGCGAT ATG GGTCATCACCATCATCACCACGGC TCGATCACAAGTTTGTACAAAAAAGCAGGCTCA GAAAACCTGTATTTTCAGGGAGGA(PFU GENE)* 26 E.coli K12 hyc NC_0009112 2848966-2848355 CTCGAATTCTGCAGCATGTCACCATGACACTGTGG promoter ACAGCGGCGGACGCGCTGGGTCAGTAGCGTCACAT ACTGTTGGCATGTTTCACACCAGCATTCGGCCTCT TGTTCTTCGAGGTGCAGTTTACAACCTTCCGCCAC GCTGCCGCGGCAAACCAGATCAAAACAAAAGGCAA GAGAGCTGGTTTCGACACAAGAAAATGCGCCAATT TTGAGCCAGACCCCAGTTACGCGTTTTGCGCCGTG TTTTGCGGCCTGCTGTTCGATCAATTCCAGTGCCC GTTGGCAGAGGGTTATTTCGTGCATATCGCCTCCC ATTAACTATTGCCAGCTACAAGCAATAATTGTGCC AGTGTTGATTATCCCTGCGGTGAATAATGTCGATG ATGTCGAAATGACACGTCGACACGGCGACGAAATT CATCTTTAGCTTAAAAATCTCTTTAATAACAATAA ATTAAAAGTTGGCACAAAAAATGCTTAAAGCTGGC ATCTCTGTTAAACGGGTAACCTGACAATGACTATT TGGGAAATAAGCGAGAAAGCCGATTACATCGCACA GCGGCATCGTCGCCTACAGGACCAGTGGCACATCT ACTGCAATTCGCTGGTTCAGGGGAGAGGAGGAATA AAAAATG 27 B. megaterium xylA X57598 GAATTCTAGAATCTAATATTATAACTAAATTTTCT promoter AAAAAAAACATTGGAATAGACATTTATTTTGTATA TGATGAAATAAAGTTAGTTTATTGGATAAACAAAC TAACTTTATTAAGGTAGTTGATGGATAAACTTGTT CACTTAAATCAACCCGGGAACAAGGAGGAATAAAA AATG 28 E. 
coli pRIL section GGATCCCCGTCACCCTGGATGCTGTACAATTGACG ACGACAAGGGCCCGGGCAAACTAGTAATCAGACGC GGTCGTTCACTTGTTCAGCAACCAGATCAAAAGCC ATTGACTCAGCAAGGGTTGACCGTATAATTCACGC GATTACACCGCATTGCGGTATCAACGCGCCCTTAG CTCAGTTGGATAGAGCAACGACCTTCTAAGTCGTG GGCCGCAGGTTCGAATCCTGCAGGGCGCGCCATTA CAATTCAATCAGTTACGCCTTCTTTATATCCTCCA GCCATGGCCTTGAAATGGCGTTAGTCATGAAATAT AGACCGCCATCGAGTACCCCTTGTACCCTTAACTC TTCCTGATACGTAAATAATGATTTGGTGGCCCTTG CTGGACTTGAACCAGCGACCAAGCGATTATGAGTC GCCTGCTCTAACCACTGAGCTAAAGGGCCTTGAGT GTGCAATAACAATACTTATAAACCACGCAATAAAC ATGATGATCTAGAGAATCCCGTCGTAGCCACCATC TTTTTTTGCGGGAGTGGCGAAATTGGTAGACGCAC CAGATTTAGGTTCTGGCGCCGCTAGGTGTGCGAGT TCAAGTCTCGCCTCCCGCACCATTCACCAGAAAGC GTTGATCGGATGCCCTCGAGTCGGGCAGCGTTGGG TCCTGGCCACGGGTGCGCATGATCGTGCTCCTGTC GTTGAGGACCCGGCTAGGCTGGCGGGGTTGCCTTA CTGGTTAGCAGAATGAATCACCGATACGCGAGCGA ACGTGAAGCGACTGCTGCTGCAAAACGTCTGCGAC CTGAGCTC * TheE. coli hya promoter, including the ATG protein translation initiation site is indicated in boldface in the table. The region immediately after includes ggt (encoding a Glycine)/catcaccatcatcaccac(6x His tag) / ggctcgatcacaagttt gtacaaaaaagcaggctca (Gateway attB1 site, encoding GSITSLYKKAGS)/gaaaacct gtattttcaggga (encoding TEV protease recognition site: ENLYFQG, TEV protease cut between Q and G)/gga, encoding another Glycine (SEQ ID NO: 50). At the asterisk, P. furiosus genes are cloned without a start codon to create a fusion protein MGHHHHHHGSITSLYKKAGSENLYFQGG-Pfu target gene (MGHHHHHHGSITSLYKKAGSENLYFQGG, SEQ ID NO: 51).

TABLE 3 Complete list of vectors constructed.

Plasmid                 Promoter  Gene                               Antibiotics
pHA-BC                  hya       0894-hybC                          Amp
pHA-CS                  hya       0894-CS                            Amp
pET-CAG                 Gateway plasmid, with promoter P-hya, Ampicillin resistant
pET-CXG                 Gateway plasmid, with promoter P-xylA, Ampicillin resistant
pEA-SH1                 hya       0891-0894                          Amp
pDEST-C11               T7 promoter, Gateway plasmid, from pDEST-C1, Streptomycin resistant
pDEST-C11A              hya, Gateway plasmid, from pDEST-C1, Streptomycin resistant
pDEST-C11A-hypABI       hya       PF0615-0617                        Sm
pC11A-CDABI             hya       PF0548-0549-0615-0616-0617         Sm
pDEST-C3A               Gateway plasmid with P-hya promoter in front of Gateway cassette, Chloramphenicol resistant
pDEST-C3X               Gateway plasmid with P-xylA promoter in front of Gateway cassette, Chloramphenicol resistant
pDEST-C3-SH1            T7        PF0891-0894                        Cm
pDEST-C3A-SH1           hya       PF0891-0894                        Cm
pDEST-C3A-lacZ          hya       lacZ                               Cm
pDEST-C3X-lacZ          P-xylA    lacZ                               Cm
pDEST-C3AR              derivative of plasmid pDEST-C3A, in which the RIL fragment was inserted
pC3A-slyD               hya       PF1401                             Cm
pC3AR-slyD              hya       PF1401                             Cm
pRSF-CAG                Gateway plasmid, sequencing confirmed, Kanamycin resistant, done by JS
pRSF-CXG
pRA-hypE                hya       PF0604                             Kan
pRA-hypF                hya       PF0559                             Kan
pRA-EF                  hya       PF0604-0559                        Kan
pDONR/zeo-hycI                    PF0617                             Zeo
pDONR/zeo-hypCD-ABI               PF0548-0549/0615-0617              Zeo
pDONR/zeo-hypEF                   PF0604/0559                        Zeo
pDONR/zeo-slyD                    PF1401                             Zeo
pDONR/zeo-lacZ                    E. coli lacZ N-terminal sequence   Zeo
pDONR/zeo-hypCD                   PF0548-0549                        Zeo
pDONR/zeo-hypE                    PF0604                             Zeo
pDONR/zeo-hypF                    PF0559                             Zeo

Amp, ampicillin resistance marker; Sm, streptomycin/spectinomycin resistance marker; Cm, chloramphenicol resistance marker; Kan, kanamycin resistance marker; Zeo, zeocin resistance marker.

TABLE 4 Compatible anaerobic expression vectors utilized to express functional P. furiosus cytoplasmic hydrogenase¹ in E. coli.

Vector         Parent Vector  Antibiotic Resistance marker  P. furiosus gene products  P. furiosus gene number⁶
pC11A-CDABI⁶   pDEST-C1²      Streptomycinᴿ                 HypCDAB, HycI              PF0548, PF0549, PF0615-0617
pC3AR-slyD¹    pDEST-C3³      Chloramphenicolᴿ              SlyD                       PF1401
pEA-SH1        pET23(+)⁴      Ampicillinᴿ                   Hydrogenase I              PF0891-PF0894
pRA-EF⁷        pRSFDuet-1⁵    Kanamycinᴿ                    HypEF                      PF0604, PF0559

¹Also includes the region (SEQ ID NO: 28, see Table 2) of the Stratagene (La Jolla, CA) helper plasmid pRIL from BL21-CodonPlus (DE3)-RIL competent cells, catalog number 230245. This strain carries the pRIL plasmid, which expresses transfer RNAs that are rare in E. coli.
²Horanyi et al., (U.S. patent application 20060183193)
³Horanyi et al., (U.S. patent application 20060183193)
⁴EMD Chemicals Inc., Catalog Number 69771-3.
⁵EMD Chemicals Inc., Catalog Number 71341.
⁶An artificial intergenic sequence was introduced between the hypD and hypA coding regions to create a Shine-Dalgarno ribosome binding site for hypA. The CD-ABI intergenic sequence is gaggtggaaa (SEQ ID NO: 52), with an artificial Shine-Dalgarno sequence ( aggaggtg ) in front of the hypA gene. Expression of hypD stops at TAG, while hypA starts with ATG: (hypD-tttacaaatatggcgccctgatgt aggaggtg gaaaATGcacgaatgggcgttggcagatgcaatagtaagg-hypA) (tttacaaatatggcgccctgatgt aggaggtg gaaaATGcacgaatgggcgttggcagatgcaatagtaagg, SEQ ID NO: 53).
⁷An artificial intergenic sequence was introduced between the hypE and hypF coding regions to create a Shine-Dalgarno ribosome binding site for hypF. The hypE-hypF intergenic sequence is again gaggtggaaa (SEQ ID NO: 52), with the same artificial Shine-Dalgarno sequence ( aggaggtg ) in front of the hypF gene. Expression of hypE stops at TAG, while hypF starts with ATG: hypE-gtgatcccgttcctagagtttgtt ag gaggtggaaaATGatctgggggagagaatgaaagcttatagaattcacg-hypF (gtgatcccgttcctagagtttgttaggaggtggaaaATGatctgggggagagaatgaaagatatagaattcacg; SEQ ID NO: 54).

In addition, one of the vectors, pC3AR-slyD (Table 3) has been further modified to include a region (SEQ ID NO: 28) of the Stratagene (La Jolla, Calif.) helper plasmid pRIL. This plasmid was purified from E. coli BL21-CodonPlus cells from Stratagene (La Jolla, Calif. catalog #230240). This overexpresses transfer RNAs that are rare in E. coli but are required for efficient expression of P. furiosus proteins due to differences in codon usage between the two organisms. This eliminates the need for yet another vector (containing pRIL) and yet another antibiotic resistance marker. The following sequence was amplified from pRIL by PCR, and inserted into pDEST-C3A to create destination plasmid pC3A-RIL, which was used to make expression plasmid pC3AR-slyD (ggatccccgtcaccctggatgctgtacaattgacgacgacaagggcccgggcaaactagtaatcagac gcggtcgttcacttgacagcaaccagatcaaaagccattgactcagcaagggagaccgtataattcacg cgattacaccgcattgcggtatcaacgcgccatagctcagttggatagagcaacgaccactaagtcgtg ggccgcaggttcgaatcctgcagggcgcgccattacaattcaatcagttacgccttctttatatcctccagc catggccttgaaatggcgttagtcatgaaatatagaccgccatcgagtaccccttgtaccataactcttcct gatacgtaaataatgatttggtggccatgctggacttgaaccagcgaccaagcgattatgagtcgcctgc tctaaccactgagctaaagggccttgagtgtgcaataacaatacttataaaccacgcaataaacatgatga tctagagaatcccgtcgtagccaccatctttnttgcgggagtggcgaaattggtagacgcaccagatttag gttctggcgccgctaggtgtgcgagttcaagtctcgcctcccgcaccattcaccagaaagcgttgatcgg atgccctcgagtcgggcagcgttgggtcctggccacgggtgcgcatgatcgtgacctgtcgttgagga cccggctaggctggcggggttgccttactggttagcagaatgaatcaccgatacgcgagcgaacgtgaa gcgactgctgctgcaaaacgtctgcgacctgagctc; SEQ ID NO:55). If all four vectors are used, there are seven possible cloning sites available, four Gateway™ recombination sites (Invitrogen, Carlsbad, Calif.) 
under control of four different anaerobic promoters, and three standard multiple cloning sites (under standard T7 promoter control), as these are derived from the Novagen Duet system vectors (EMD Chemicals, San Diego, Calif.), with the exception of pEA-SH1, which was derived from pET23, also from Novagen but not part of the Duet system of vectors. However, as many as five consecutive genes can be cloned in tandem under control of the P-hya promoter (plasmid pC11A-CDABI), and all were expressed as demonstrated by quantitative PCR, as described below. This means that as many as twenty genes, and potentially more, can be coexpressed anaerobically using these compatible vectors. Herein we used all four vectors to express 12 genes from P. furiosus. In each construct, a single gene, or the first gene (at the 5′ end) of any group of genes, had a poly His-tag that is cleavable with TEV protease.
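The bookkeeping behind the four-vector system can be illustrated with a short sketch. The vector names, resistance markers, and PF gene numbers are those listed in Table 4; the dictionary layout and the helper name are assumptions made only for illustration. The "twenty genes" estimate is read here as four Gateway sites with up to five tandem genes each, which is an interpretation of the text rather than a stated formula.

```python
# Vector contents as listed in Table 4; the data layout is illustrative only.
VECTORS = {
    "pC11A-CDABI": {
        "marker": "streptomycin",
        "genes": ["PF0548", "PF0549", "PF0615", "PF0616", "PF0617"],
    },
    "pC3AR-slyD": {"marker": "chloramphenicol", "genes": ["PF1401"]},
    "pEA-SH1": {
        "marker": "ampicillin",
        "genes": ["PF0891", "PF0892", "PF0893", "PF0894"],
    },
    "pRA-EF": {"marker": "kanamycin", "genes": ["PF0604", "PF0559"]},
}


def summarize(vectors):
    """Check marker uniqueness and count the genes carried by a vector set."""
    markers = [v["marker"] for v in vectors.values()]
    genes = [g for v in vectors.values() for g in v["genes"]]
    return {
        # Co-maintenance relies on each vector having its own selection marker.
        "distinct_markers": len(set(markers)) == len(markers),
        "gene_count": len(set(genes)),
        "max_genes_per_vector": max(len(v["genes"]) for v in vectors.values()),
    }


summary = summarize(VECTORS)
# Assumed reading of the estimate in the text: 4 Gateway sites x 5 tandem genes.
potential_genes = 4 * summary["max_genes_per_vector"]
print(summary, potential_genes)
```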

Example 2 Growth of Recombinant E. coli and Production of Recombinant P. furiosus Hydrogenase

The E. coli strain used for expression of the P. furiosus hydrogenase was MW1001, a derivative of the strain BW25113. This strain has the genotype (hyaB hybC hycE Δkan; defective in the large subunit (LSU) of hydrogenases 1, 2, and 3, no antibiotic marker) and lacks detectable E. coli hydrogenase activity (Maeda et al. 2007. BMC Biotechnol 7:25).

To obtain the recombinant form of P. furiosus cytoplasmic hydrogenase I, recombinant E. coli cells containing the four vectors (Table 4) were grown on an 8 L scale at 37° C. in 2×YT media (16 g Tryptone, 10 g Yeast Extract, 5 g NaCl per liter) supplemented with 25 μM NiCl₂, 100 μM FeCl₃, 2 mM MgSO₄ and the antibiotics Ampicillin (50 μg/ml), Chloramphenicol (16.5 μg/ml), Streptomycin (25 μg/ml) and Kanamycin (25 μg/ml). Cloning the complete P. furiosus SHI operon in E. coli resulted in low efficiency of transformation; however, all techniques used for cloning and transformations were standard molecular biology techniques as described (Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989. Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.), and transformants were obtained. The culture was sparged with sterile, compressed air (3-5 L/min) until an OD₆₀₀ of ˜0.3 was reached. At this time the compressed air was turned off, the cells were sparged with sterile argon (˜4 L/min), and 2% glucose and 30 mM sodium formate were added to supplement growth and induce hydrogenase-related genes in E. coli. The culture was allowed to ferment for five hours and the cells were then quickly harvested by centrifugation and frozen at −80° C. Frozen cells were then thawed and lysed at 25° C. in anaerobic 50 mM Tris buffer pH 8.0, 2 mM sodium dithionite, 0.5 mg/mL lysozyme, 50 μg/mL DNase at a ratio of 1 g/3 mL in an anaerobic chamber under an atmosphere of 5% hydrogen/95% argon overnight.

A hydrogen evolution assay was used to measure hydrogenase activity using an artificial (methyl viologen) electron carrier with sodium dithionite as the electron donor as described (Ma and Adams. 2001. Methods Enzymol 331:208-16). Briefly, this was carried out using 5 mL stoppered vials containing 2 mL of anaerobic 100 mM EPPS buffer pH 8.4, 10 mM sodium dithionite, and 1 mM Methyl Viologen under an atmosphere of argon. Vials were preheated at 80° C. for 1 min and then 200 μL of sample was injected. Samples (100 μL) of the headspace of the sealed vial were removed with a gas-tight syringe and injected into a gas chromatograph after the reaction had proceeded for 6 min. The resulting hydrogen peak was compared to a known standard curve to calculate micromoles of hydrogen produced per mL of assay solution. Specific activity is defined as micromoles H₂ produced min⁻¹ mg protein⁻¹. After cell lysis the following samples were analyzed for hydrogen evolution at 80° C.: Whole cell extracts (WCEs), the cytoplasmic extract after a 100,000×g centrifugation (S100), and heat-treated (at 80° C. for 30 min) and re-centrifuged S100. The data are summarized in Table 5.
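The conversion from a gas chromatograph reading to specific activity can be sketched as follows. This is a minimal sketch under stated assumptions: one unit is taken to mean 1 micromole H₂ produced per minute, the assay volume is taken as the 2 mL of buffer plus the 200 μL sample, and the numerical readings are hypothetical placeholders, not data from the examples. The function names are illustrative.

```python
def total_units(umol_h2_per_ml, assay_volume_ml, reaction_min):
    """Hydrogenase units in one vial.

    Assumes one unit = 1 micromole H2 produced per minute, and that the
    standard curve has already converted the GC peak to micromoles H2
    per mL of assay solution.
    """
    return umol_h2_per_ml * assay_volume_ml / reaction_min


def specific_activity(units, protein_mg):
    """Micromoles H2 produced per minute per mg protein."""
    return units / protein_mg


# Hypothetical reading: 0.3 micromole H2 per mL after the 6 min reaction,
# in 2.2 mL of assay solution (2 mL buffer + 0.2 mL sample, an assumption).
units = total_units(umol_h2_per_ml=0.3, assay_volume_ml=2.2, reaction_min=6.0)

# Hypothetical protein load of 0.5 mg in the injected sample.
activity = specific_activity(units, protein_mg=0.5)
print(f"{units:.3f} units, {activity:.3f} units/mg")
```

Dividing by the protein actually present in the vial, rather than by the culture volume, is what makes values comparable across the WCE, S100, and heat-treated S100 fractions.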

TABLE 5 MV-linked H₂-evolving activity of recombinant P. furiosus cytoplasmic hydrogenase I.

                     BW25113¹                          MW1001²
Step                 Total Units  Specific Activity³   Total Units  Specific Activity³
WCE                  891          2.7                  ND⁴          ND⁴
S100                 2            0.02                 ND⁴          ND⁴
80° C. treated S100  ND⁴          ND⁴                  ND⁴          ND⁴

                     MW1001 + SHI⁵                     MW1001 + SHI + Pf Plasmids⁶
Step                 Total Units  Specific Activity³   Total Units  Specific Activity³
WCE                  ND⁴          ND⁴                  2.9          0.008
S100                 ND⁴          ND⁴                  3.8          0.04
80° C. treated S100  ND⁴          ND⁴                  4.9          0.31

¹Obtained from T.K. Wood, Texas A&M University, College Station, TX.
²See reference (Maeda et al. 2007. Appl Microbiol Biotechnol 76: 1035-1042).
³Specific activity is defined as μmol H₂ produced min⁻¹ mg protein⁻¹.
⁴Not detected (below detection limit of 0.017 Units, measured with 0.5 mg protein after 2 minutes).
⁵Contains one plasmid expressing the four structural genes that encode P. furiosus hydrogenase: pEA-SH1 (PF0891-0894).
⁶Contains all four plasmids expressing P. furiosus hydrogenase genes including structural and processing genes: pEA-SH1 (PF0891-0894), pC11A-CDABI (PF0548-0549, PF0615-0617), pRA-EF (PF0604, PF0559), pC3AR-slyD (PF1401).

The data clearly demonstrate H₂ evolution from cells expressing the genes encoding P. furiosus hydrogenase, with no detectable H₂ produced by the control strain lacking any gene from P. furiosus. The form of the P. furiosus enzyme responsible for this activity was not only stable at 80° C. for 30 min, but it was activated by this heat treatment, a step that also precipitates heat-labile E. coli proteins. This increase was unexpected and, at 28%, significant. Production of protein corresponding to the catalytic subunit of hydrogenase I (encoded by PF0894) has been confirmed by immunoanalysis (FIG. 5). In addition, expression of the P. furiosus genes in E. coli using these constructs at the level of mRNA has been confirmed by quantitative PCR (FIG. 6). In comparison to the natively purified P. furiosus hydrogenase, FIG. 9 demonstrates that the MV-linked H₂ evolution activity was virtually identical. The expression of coding regions PF0891-0894 resulted in a His-tag present at the amino terminal end of the polypeptide encoded by PF0891, the beta subunit. This tag did not result in a hydrogenase polypeptide that could be affinity purified; however, the hydrogenase polypeptide was active, suggesting the hydrogenase polypeptide is permissive for mutations.
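The effect of the heat treatment can be checked directly from the Table 5 values for the strain carrying all four plasmids. The sketch below is illustrative only: the numbers are copied from Table 5, and the helper name and data layout are hypothetical. The change in total units between the S100 and heat-treated S100 steps comes out close to the figure quoted above, and the specific activity rises by a larger factor because heat-labile E. coli protein is removed.

```python
# Values for MW1001 + SHI + Pf plasmids, copied from Table 5.
STEPS = {
    "WCE": {"units": 2.9, "specific": 0.008},
    "S100": {"units": 3.8, "specific": 0.04},
    "heat-treated S100": {"units": 4.9, "specific": 0.31},
}


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return 100.0 * (after - before) / before


# Change in total activity across the 80 degree C heat step.
unit_gain = percent_change(
    STEPS["S100"]["units"], STEPS["heat-treated S100"]["units"]
)

# Fold enrichment in specific activity across the same step.
fold_enrichment = (
    STEPS["heat-treated S100"]["specific"] / STEPS["S100"]["specific"]
)

print(f"total units change: {unit_gain:.1f}%")
print(f"specific activity enrichment: {fold_enrichment:.2f}-fold")
```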

We have therefore demonstrated heterologous expression of the hydrogenase genes in E. coli. Analysis of cell extracts showed the presence of the mRNA (by PCR) and of the protein (by western blot), and this gene expression leads to the production of a functional recombinant hydrogenase that is catalytically active at 80° C. (by hydrogen production measurements) and is also heat stable at 80° C. (for at least 30 min).

Example 3 Production of Hydrogenase by E. coli

The ability of E. coli containing the four compatible vectors, termed strain MW/rSHI-C, to produce the recombinant hydrogenase was investigated throughout the growth phase (FIG. 10). The strain was grown on an 8-liter scale in carboys in 2×YT growth media (16 g tryptone, 10 g yeast extract and 5 g NaCl per liter) supplemented with 1% glucose, 2 mM MgSO₄, Amp (50 μg/ml), Cm (16 μg/ml), Sm (25 μg/ml) and Kan (25 μg/ml); see Table 4. FIG. 10 summarizes the results from two separate cultures (one indicated by circles, one by triangles). At an OD₆₀₀ of 0.2-0.3, 100 μM FeCl₃ and 25 μM NiSO₄ were added, and the culture was then sealed and allowed to ferment anaerobically (indicated by the arrow in FIG. 10). The growth curves are shown by solid symbols. Samples of the culture were taken every hour after the anaerobic switch. The cells were harvested by centrifugation, lysed, and analyzed for MV-linked hydrogenase activity at 80° C. (shown by open symbols). The results show that hydrogenase activity is not detected in E. coli MW/rSHI-C until the cells are switched to anaerobic growth, which is expected since expression of the P. furiosus genes is induced by the so-called anaerobic hya promoter. FIG. 10 also shows that the amount of 80° C. hydrogenase activity, and thus production of the recombinant hydrogenase, increases with cell growth until late stationary phase.
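The medium composition given above can be scaled to the 8-liter carboy volume with simple arithmetic. The following Python sketch is illustrative only: the function name and data layout are hypothetical, and it assumes that "1% glucose" means 1% (w/v), which the text does not state.

```python
# Per-liter composition of the supplemented 2xYT medium as listed in the text.
# ASSUMPTION: "1% glucose" is read as 1% (w/v), i.e. 10 g per liter.
SOLIDS_G_PER_L = {"tryptone": 16.0, "yeast extract": 10.0, "NaCl": 5.0, "glucose": 10.0}

# Antibiotic concentrations in ug/ml, which is numerically equal to mg per liter.
ANTIBIOTICS_MG_PER_L = {"Amp": 50.0, "Cm": 16.0, "Sm": 25.0, "Kan": 25.0}

# Salt concentrations in micromolar. MgSO4 (2 mM) is in the medium from the start;
# FeCl3 and NiSO4 are added at an OD600 of 0.2-0.3.
SALTS_UM = {"MgSO4": 2000.0, "FeCl3": 100.0, "NiSO4": 25.0}


def scale_medium(volume_l):
    """Return the total amount of each component for a culture of volume_l liters."""
    return {
        "solids_g": {k: v * volume_l for k, v in SOLIDS_G_PER_L.items()},
        "antibiotics_mg": {k: v * volume_l for k, v in ANTIBIOTICS_MG_PER_L.items()},
        # micromolar (umol per liter) times liters gives micromoles
        "salts_umol": {k: v * volume_l for k, v in SALTS_UM.items()},
    }


if __name__ == "__main__":
    for group, amounts in scale_medium(8.0).items():
        print(group, amounts)
```

Converting the micromole amounts to weighed masses would additionally require the formula weight of the specific salt hydrate used, which the text does not specify.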

Cell yields of recombinant E. coli MW/rSHI-C approached 1 gram (wet weight)/liter when grown on the 8-liter scale in carboys. We also demonstrated that the same strain could be grown to extremely high cell densities and, once switched to anaerobic conditions, produced the recombinant hydrogenase, as measured by hydrogenase activity at 80° C. Cells were grown in a 5-liter controlled fermentation system (New Brunswick) on the same medium that was used in the carboys but with controlled a) pH (6.5), b) dissolved oxygen, and c) glucose concentration. As shown in FIG. 11, cells were grown to an OD₆₀₀ of 38 before switching to anaerobic conditions, in this case by replacing the air with Argon, and this induced the production of the recombinant hydrogenase activity to approximately the same level as in the 8-liter carboy cultures (˜0.1 unit/mg before heat treatment). The cell yield in this case was ˜40 gram (wet weight)/liter.

Example 4 Purification of Hydrogenase

A method for purifying the recombinant hydrogenase was developed that enabled confirmation of the production of the recombinant forms of all four of the protein subunits of P. furiosus hydrogenase. The scheme is summarized in FIG. 12, and involves two standard column chromatography steps using DEAE-Sepharose and Phenyl Sepharose (GE Healthcare). In brief, the E. coli cells (154 gram, wet weight) were broken by thawing them in anaerobic 50 mM Tris, pH 8.0 (3 mL per gram of frozen cells) containing 0.5 mg/mL lysozyme, 50 μg/mL DNase, 1 mM phenylmethylsulfonyl fluoride, and 2 mM sodium dithionite. The suspension was incubated at room temperature in an anaerobic chamber under an atmosphere of 5% H₂/95% Ar for 4 hours to allow the cells to break. The sample was then sealed in an anaerobic flask and heat-treated at 80° C. for 30 min by immersion of the flask in a hot water bath. Samples were then anaerobically centrifuged at 100,000×g for 30 min. The supernatant (650 ml) was then diluted 5-fold with Buffer A (50 mM Tris, 2 mM sodium dithionite, pH 8.0) and loaded onto a column of DEAE Sepharose (300 ml; GE Healthcare) equilibrated in Buffer A. The column was then washed with 5 column volumes of Buffer A and eluted with a 20-column-volume gradient from 0 to 25% Buffer B (Buffer A+2 M NaCl) in 40 ml fractions. Fractions that contained hydrogenase activity in the standard assay (at 80° C. using reduced methyl viologen as the electron donor) were combined, and Buffer A containing 2.0 M ammonium sulfate ((NH₄)₂SO₄) was added to a final concentration of 0.8 M. The sample was then loaded onto a column of Phenyl Sepharose (45 ml) equilibrated in Buffer C (Buffer A containing 0.8 M (NH₄)₂SO₄). The column was washed with 5 column volumes of Buffer C and eluted with a 20-column-volume gradient from 100% Buffer C to 100% Buffer A in 10 ml fractions. Fractions containing hydrogenase activity were combined.
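Two of the volumes implied by this procedure follow from simple mass-balance arithmetic: the lysis buffer volume at 3 mL per gram of cells, and the volume of 2.0 M ammonium sulfate stock needed to bring a pooled sample to 0.8 M. The Python sketch below is a minimal illustration with hypothetical function names; it assumes the pooled sample initially contains no ammonium sulfate and that volumes are additive, neither of which is stated in the text.

```python
def lysis_buffer_ml(cell_mass_g, ml_per_g=3.0):
    """Volume of lysis buffer for a given wet cell mass at ml_per_g mL per gram."""
    return cell_mass_g * ml_per_g


def stock_volume_ml(sample_ml, stock_molar=2.0, final_molar=0.8):
    """Volume of stock solution to add to sample_ml to reach final_molar.

    ASSUMPTIONS: the sample starts with none of the solute and volumes are additive.
    Mass balance: stock_molar * V = final_molar * (sample_ml + V).
    """
    if not 0 <= final_molar < stock_molar:
        raise ValueError("final concentration must be below the stock concentration")
    return sample_ml * final_molar / (stock_molar - final_molar)


if __name__ == "__main__":
    # 154 g of cells at 3 mL per gram
    print(lysis_buffer_ml(154.0))
    # stock needed per 100 ml of pooled fractions (the pool volume is not given in the text)
    print(stock_volume_ml(100.0))
```

Under these assumptions, every 3 volumes of pooled sample require 2 volumes of the 2.0 M stock to reach 0.8 M.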

Typical results of this two-column purification are shown in Table 6. The enzyme was purified almost 60-fold, and about 20% of the total activity was recovered, with a specific activity in the standard 80° C. assay of 6 units/mg. SDS gel analysis of the hydrogenase active fractions obtained at the different purification steps is shown in FIG. 13. The most purified fractions (the PS Pool from the Phenyl Sepharose column) contain six or so major bands on SDS gels. Analysis of the bands that migrated at the expected molecular weights for the four subunits of the recombinant hydrogenase (see FIG. 11) by standard tryptic digestion/mass spectrometry (MALDI) confirmed unambiguously that those were the four subunits of the P. furiosus hydrogenase enzyme.

TABLE 6 Isolation of recombinant hydrogenase.

Step                          Total Units^(a)   Total Protein   Specific   % Yield   Fold
                              (μmol min⁻¹)      (mg)            Activity             Purification
Cell Lysate                   1349              13059           0.1        100       1
S100 (after 80° C./30 min)    1380              1231            1          102       11
DEAE Sepharose                640               301             2          47        21
Phenyl Sepharose              239               41              6          18        56

^(a)Hydrogenase activity was measured at 80° C. using reduced MV as the electron donor. One unit of activity is equivalent to the production of 1 μmole of hydrogen per minute.
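The derived columns of Table 6 follow from the total units and total protein at each step. The short Python sketch below recomputes them as a check; the variable and function names are illustrative, and the only inputs are the total units and protein values reported in the table.

```python
# (step name, total units in umol/min, total protein in mg) as reported in Table 6
STEPS = [
    ("Cell Lysate", 1349.0, 13059.0),
    ("S100 (after 80 C/30 min)", 1380.0, 1231.0),
    ("DEAE Sepharose", 640.0, 301.0),
    ("Phenyl Sepharose", 239.0, 41.0),
]


def purification_table(steps):
    """Compute specific activity, percent yield, and fold purification per step.

    Specific activity = units / mg protein.
    Yield = units relative to the first step, in percent.
    Fold purification = specific activity relative to the first step.
    """
    _, units0, protein0 = steps[0]
    sa0 = units0 / protein0
    rows = []
    for name, units, protein in steps:
        sa = units / protein
        rows.append({
            "step": name,
            "specific_activity": sa,
            "yield_pct": 100.0 * units / units0,
            "fold": sa / sa0,
        })
    return rows


if __name__ == "__main__":
    for row in purification_table(STEPS):
        print(f"{row['step']:<28} {row['specific_activity']:6.2f} "
              f"{row['yield_pct']:6.0f} {row['fold']:6.0f}")
```

Rounded to whole numbers, the computed yields and fold-purification values agree with those listed in the table.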

Example 5 Purification of Hydrogenase

A method to obtain highly purified, near-homogeneous preparations of the hydrogenase was devised. This involves two further steps of conventional column chromatography. In brief, the PS Pool (see Table 6) was concentrated by ultrafiltration (Amicon, PM-30 membrane), and applied to a column of Sephacryl S-200 (GE Healthcare) equilibrated with Buffer A. The same buffer was used to elute the column. Fractions that contained hydrogenase activity in the standard assay were combined and applied directly to a column of Hydroxyapatite (Life Science Research, Hercules, Calif.) equilibrated in Buffer A. The column was washed with 5 column volumes of Buffer A and eluted with a 20-column-volume gradient from 0 to 50% Buffer D (Buffer A+0.5 M potassium phosphate). Fractions containing hydrogenase activity were combined. As shown in FIG. 14, the fractions from the Hydroxyapatite column contain highly purified hydrogenase containing four major proteins. These corresponded to the protein bands found in the native hydrogenase purified from P. furiosus. The four protein bands in the purified recombinant hydrogenase were unambiguously shown by tryptic digest/MALDI analysis to correspond to the four subunits of the recombinant form of P. furiosus hydrogenase. In addition, the hydrogenase activity from the Sephacryl S-200 column eluted as a single band with a molecular weight of approximately 150,000, showing that it was a homogeneous species whose size corresponds to that of the native enzyme, which consists of a heterotetramer of four different polypeptides (see FIG. 14).

Example 6 Metal Analysis

The purified recombinant hydrogenase has hydrogen-evolving activity and must therefore contain a nickel-iron catalytic site. This is demonstrated by a metal analysis of the fractions eluting from the Phenyl Sepharose column using the technique of ICP-MS (Model 7500ce, Agilent Technologies). As shown in FIG. 15, fractions that contained hydrogenase activity also contained both nickel and iron. Moreover, the Fe:Ni ratio was approximately 20, which is almost identical to the value (Fe:Ni=19) proposed to be in the native P. furiosus enzyme (see proposed cofactor content in FIG. 14). Therefore, the recombinant hydrogenase has the expected metal content, consistent with a fully functional enzyme.

FIG. 15 shows a major additional peak of nickel that is not associated with the enzyme. We propose that this nickel is not inserted into the hydrogenase protein because of a limiting growth factor for hydrogenase biosynthesis in E. coli, but that insertion would occur when E. coli is grown under the appropriate conditions. As an example, nickel may not be processed completely due to limited availability of the cyanide and carbon monoxide ligands that are coordinated to the nickel-iron catalytic site. Others have shown that carbamoyl phosphate is the source of the cyanide (Paschos et al. 2001. FEBS Lett 488:9-12). E. coli cells deficient in carbamoyl phosphate (CP) synthesis (by lesion of the carAB locus) lose the ability to synthesize active hydrogenase enzymes (Blokesch and Bock. 2002. Journal of Molecular Biology 324:287-296). It was shown that the ΔcarAB strain contained a stable HypC-HypD complex but that processing of hydrogenase does not occur. The complex disappeared and processing and hydrogenase production were restored when a source of CP (L-citrulline) was added to the E. coli growth media. It is anticipated that the addition of this or similar sources of key nutrients will dramatically increase the yield of active recombinant P. furiosus hydrogenase produced in E. coli.

Example 7 Temperature and Oxygen Sensitivity and Electron Donor Specificity of Recombinant Hydrogenase

Purified recombinant hydrogenase is as stable to incubation at high temperature (90° C.) and as sensitive to oxygen as the native form of the enzyme purified from P. furiosus native biomass. For example, as shown in FIG. 16, the thermal stability of purified recombinant hydrogenase (7.5 mg/ml) and the native hydrogenase (0.4 mg/ml) was analyzed by incubating samples anaerobically under Argon in 100 mM EPPS buffer, pH 8.4, containing 2 mM sodium dithionite in sealed 8-ml serum vials in a 90° C. water bath. Samples were analyzed for 80° C. MV-linked hydrogen evolution activity periodically during the incubation. Both enzyme preparations showed an initial activation to over 150% of the initial activity, as originally reported with the native enzyme (Bryant and Adams. 1989. J Biol Chem 264:5070-5079). Moreover, the recombinant enzyme continued to exhibit an activity above 150% of the initial value even after 11 hours at 90° C., while that of the native enzyme decreased (FIG. 16). However, such stability is dependent upon the protein concentration and increases as the concentration increases. Given the approximately 19-fold higher protein concentration of the recombinant enzyme (7.5 mg/ml versus 0.4 mg/ml), it can be concluded that the stabilities of the two forms are comparable.

FIG. 17 shows the results of incubating the purified recombinant hydrogenase (7.5 mg/ml) and the native hydrogenase (0.4 mg/ml) in 100 mM EPPS buffer, pH 8.4, in 8-ml serum vials at room temperature that were exposed at zero time to 20% oxygen (air). The sensitivities of the two forms to oxygen, a property that is not dependent upon protein concentration, were virtually identical.

The recombinant hydrogenase, like the native enzyme, is also able to use NADPH as an electron donor for hydrogen production at 80° C. As shown in Table 7, the two forms exhibit between 3 and 12% of the activity with MV as the electron donor when it is replaced by NADPH (1 mM) under the same assay conditions. The activity, oxygen and thermal stability data, summarized in Table 7, indicate that the structural and catalytic integrity of the recombinant hydrogenase is comparable to that of the native enzyme.

TABLE 7 Subunit Structure and Electron Donor Specificity of Native and Recombinant Forms of Hydrogenase

Enzyme Type                                       MV-linked    NADPH-linked   Ratio   Stability at 90° C.   Stability in Air
                                                  (units/mg)   (units/mg)     (%)     (t_(1/2), hr)         (t_(1/2), hour)
Native hydrogenase (from P. furiosus biomass)     109          12.7           12      7                     >12
Recombinant Hydrogenase (αβγδ)^(a)                5.7          0.15           3       >12                   6
Dimeric Recombinant Hydrogenase (αδ)^(b)          0.4          0              —       ~1                    ~1

Activities were measured using either 1 mM MV or 1 mM NADPH as the electron donor at 80° C. The stability values for the native and recombinant (αβγδ) enzymes are estimates from FIG. 17. The data used to estimate the values for the dimeric form (αδ) are not shown.
^(a)The form of the tetrameric recombinant hydrogenase (αβγδ) used in this experiment was obtained after two chromatography steps (see Table 6).
^(b)The form of the dimeric recombinant hydrogenase (αδ) used in this experiment was obtained after the cell-free extract was clarified by centrifugation (the S-100 fraction). The dimeric form of the hydrogenase is described below.
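The "Ratio (%)" column of Table 7 expresses the NADPH-linked activity as a percentage of the MV-linked activity. The Python sketch below recomputes it from the two activity columns; the names are illustrative, and the only inputs are the values reported in the table.

```python
# (enzyme, MV-linked units/mg, NADPH-linked units/mg) as reported in Table 7
ACTIVITIES = [
    ("Native hydrogenase", 109.0, 12.7),
    ("Recombinant hydrogenase (tetrameric)", 5.7, 0.15),
    ("Dimeric recombinant hydrogenase", 0.4, 0.0),
]


def nadph_to_mv_percent(mv_units_per_mg, nadph_units_per_mg):
    """NADPH-linked activity as a percentage of MV-linked activity."""
    if mv_units_per_mg <= 0:
        raise ValueError("MV-linked activity must be positive")
    return 100.0 * nadph_units_per_mg / mv_units_per_mg


if __name__ == "__main__":
    for name, mv, nadph in ACTIVITIES:
        print(f"{name}: {nadph_to_mv_percent(mv, nadph):.1f}%")
```

The dimeric form has no detectable NADPH-linked activity, so its computed ratio is zero; the table reports this entry as a dash.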

Example 8 Production of a Dimeric Hydrogenase

The ability to generate the recombinant form of the hydrogenase opens up a complete spectrum of possibilities to produce mutant forms with very different properties from those of the native form. For example, FIG. 18 shows the proposed electron pathway from NADPH through the four subunits of the enzyme and the electron-carrying cofactors (FAD and then multiple [2Fe-2S] and [4Fe-4S] clusters) to the NiFe catalytic site, which catalyzes hydrogen (H₂) production. It is assumed that the artificial electron carrier, MV, can donate electrons directly to one or more of the [2Fe-2S] and [4Fe-4S] clusters, by-passing the FAD (see FIG. 18). Consequently, the native heterotetrameric (αβγδ) enzyme produced from 4 genes (PF0891-PF0894) evolves hydrogen from both MV and NADPH (Table 7). However, as shown in FIG. 19, a heterodimeric (αδ) enzyme produced by expression of only PF0893 and PF0894 would lack the proposed NADPH-interacting and FAD-containing γ-subunit (PF0892). This dimeric form would not be expected to evolve hydrogen from NADPH, but may do so from MV (FIG. 19).

To test this idea and to generate the first mutant form of recombinant P. furiosus hydrogenase, a plasmid, pEA-0893-0894, was constructed that contained only two of the four hydrogenase subunit genes, PF0893 and PF0894 (FIG. 20). This was based on the plasmid that contains the four genes that encode all four subunits (pEA-SH1, FIG. 8); however, the P-hya promoter in this plasmid did not include the sequences encoding a his-tag. The dimeric (αδ) recombinant enzyme was produced in E. coli strain MW1001 under the same anaerobic expression conditions that were used to produce the recombinant heterotetrameric (αβγδ) enzyme (see FIG. 10) except that the pEA-SH1 plasmid was replaced by the pEA-0893-0894 plasmid and that the culture was grown in a 1-liter flask rather than an 8-liter carboy. The recombinant cells (1.5 grams wet weight) were harvested by centrifugation and were lysed by resuspending them in 3 ml (per gram wet weight of cells) of anaerobic 50 mM Tris, pH 8.0, containing 0.5 mg/mL lysozyme, 50 μg/mL DNase, 1 mM phenylmethylsulfonyl fluoride, and 2 mM sodium dithionite. Samples were lysed by incubation at room temperature in an anaerobic chamber under an atmosphere of 5% H₂/95% Ar for 4 hours. The cell-free extract contained 8.9 mg/mL protein, as determined by the standard protein assay, and 5.2 units of hydrogenase activity, measured using MV as the electron donor at 80° C. The specific activity was 0.078 U/mg, which is comparable to that obtained with the tetrameric (αβγδ) recombinant enzyme (Table 6). However, as indicated in Table 7, the dimeric (αδ) recombinant form had no detectable hydrogen production activity using NADPH (1 mM) as the electron donor, as was predicted (FIG. 19). Also, the structural as well as the catalytic integrity of the recombinant dimeric hydrogenase differed from that of both the recombinant and native forms of the tetrameric holoenzyme. 
As shown in Table 7, the dimeric form was much more sensitive to oxygen and was much less stable at 90° C. However, the fact that this mutated form of the enzyme containing only two subunits still had an approximate half-life at 90° C. of 1 hour shows the great advantage of using a hyperthermophilic enzyme as the starting material for any manipulation of enzyme structure. The resulting protein was expected to be considerably less stable than its native counterpart, but the extreme stability of the native enzyme means that an ‘unstable’ form can still retain remarkable stability and activity, relative to conventional enzymes found in organisms growing at conventional temperatures. Moreover, with the demonstration here of an extremely stable dimeric mutant form with catalytic properties, the means to generate a wide variety of mutant forms, for example, with various tags for purification and immobilization, is now available.

In summary, a series of four compatible vectors has been constructed that will express a functional hydrogenase in E. coli. It was shown that recombinant hydrogenase was produced when cells were switched to anaerobic growth and that the amount of the enzyme produced increased with cell growth until late stationary phase. Recombinant hydrogenase was also produced in recombinant E. coli cells grown to exceedingly high densities (OD˜40). A method for purifying the recombinant hydrogenase to a high level of purity is described, and analysis of the protein components of the recombinant enzyme by a standard mass spectrometry technique established unambiguously that it contained the four hydrogenase subunits encoded by the four cloned genes that were heterologously expressed. It was also demonstrated that the recombinant enzyme has approximately the same molecular weight (˜150 kDa) and metal content (20 Fe:1 Ni) as the native enzyme purified from P. furiosus biomass. It is similarly stable to high temperature (half-life at 90° C. of ˜12 hr) and sensitive to inactivation by oxygen (half-life of ˜6 hr in air) and, like the native enzyme, uses NADPH as an electron donor for hydrogen production at 80° C. The ability to generate mutant or modified forms of the hydrogenase was demonstrated by the production of a heterodimeric form containing two subunits rather than the four subunits of the heterotetrameric enzyme. The dimeric form was still catalytically active at 80° C. with the artificial electron donor MV, but it did not use NADPH as an electron donor. The dimeric form was still very thermostable (half-life at 90° C. of ˜1 hr). This demonstrates the great advantage of using a hyperthermophilic enzyme as the starting material for any manipulation of enzyme structure.

The complete disclosure of all patents, patent applications, and publications, and electronically available material (including, for instance, nucleotide sequence submissions in, e.g., GenBank and RefSeq, and amino acid sequence submissions in, e.g., SwissProt, PIR, PRF, PDB, and translations from annotated coding regions in GenBank and RefSeq) cited herein are incorporated by reference. In the event that any inconsistency exists between the disclosure of the present application and the disclosure(s) of any document incorporated herein by reference, the disclosure of the present application shall govern. The foregoing detailed description and examples have been given for clarity of understanding only. No unnecessary limitations are to be understood therefrom. The invention is not limited to the exact details shown and described, for variations obvious to one skilled in the art will be included within the invention defined by the claims.

Unless otherwise indicated, all numbers expressing quantities of components, molecular weights, and so forth used in the specification and claims are to be understood as being modified in all instances by the term “about.” Accordingly, unless otherwise indicated to the contrary, the numerical parameters set forth in the specification and claims are approximations that may vary depending upon the desired properties sought to be obtained by the present invention. At the very least, and not as an attempt to limit the doctrine of equivalents to the scope of the claims, each numerical parameter should at least be construed in light of the number of reported significant digits and by applying ordinary rounding techniques.

Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the invention are approximations, the numerical values set forth in the specific examples are reported as precisely as possible. All numerical values, however, inherently contain a range necessarily resulting from the standard deviation found in their respective testing measurements.

All headings are for the convenience of the reader and should not be used to limit the meaning of the text that follows the heading, unless so specified. 

1. (canceled)
 2. A tetrameric polypeptide comprising four subunits, wherein the amino acid sequence of the first subunit and the amino acid sequence of SEQ ID NO:6 have at least 80% identity, wherein the amino acid sequence of the second subunit and the amino acid sequence of SEQ ID NO:8 have at least 80% identity, wherein the amino acid sequence of the third subunit and the amino acid sequence of SEQ ID NO:2 have at least 80% identity, wherein the amino acid sequence of the fourth subunit and the amino acid sequence of SEQ ID NO:4 have at least 80% identity, wherein the tetrameric polypeptide has hydrogenase activity, and wherein the tetrameric polypeptide is present in a genetically modified microbial cell.
 3. A tetrameric polypeptide comprising four subunits, wherein the amino acid sequence of the first subunit and the amino acid sequence of SEQ ID NO:6 have at least 80% identity, wherein the amino acid sequence of the second subunit and the amino acid sequence of SEQ ID NO:8 have at least 80% identity, wherein the amino acid sequence of the third subunit and the amino acid sequence of SEQ ID NO:2 have at least 80% identity, wherein the amino acid sequence of the fourth subunit and the amino acid sequence of SEQ ID NO:4 have at least 80% identity, wherein the tetrameric polypeptide has hydrogenase activity, and wherein at least one subunit of the tetrameric polypeptide is a fusion comprising a heterologous amino acid sequence.
 4. The tetrameric polypeptide of claim 3 wherein the tetrameric polypeptide is isolated.
 5. (canceled)
 6. The tetrameric polypeptide of claim 3 wherein the tetrameric polypeptide is present in a microbial cell.
 7. (canceled)
 8. The tetrameric polypeptide of claim 3 wherein the hydrogenase activity is at least 0.05 micromoles H₂ produced min⁻¹ mg protein⁻¹ when isolated by whole cell extract, centrifugation of a whole cell extract at 100,000×g, heat-treatment at 80° C. for 30 minutes, and re-centrifugation at 100,000×g.
 9-14. (canceled)
 15. The polypeptide of claim 3 wherein the polypeptide consists of the first subunit, the second subunit, the third subunit, and the fourth subunit.
 16. (canceled)
 17. A genetically modified microbe comprising an exogenous polypeptide, wherein the exogenous polypeptide comprises four subunits, wherein the first subunit comprises an amino acid sequence, and the amino acid sequence of the first subunit and the amino acid sequence of SEQ ID NO:6 have at least 80% identity, wherein the second subunit comprises an amino acid sequence, and the amino acid sequence of the second subunit and the amino acid sequence of SEQ ID NO:8 have at least 80% identity, wherein the third subunit comprises an amino acid sequence, and the amino acid sequence of the third subunit and the amino acid sequence of SEQ ID NO:2 have at least 80% identity, wherein the fourth subunit comprises an amino acid sequence, and the amino acid sequence of the fourth subunit and the amino acid sequence of SEQ ID NO:4 have at least 80% identity, and wherein the four subunits form a tetrameric polypeptide having hydrogenase activity.
 18. (canceled)
 19. The genetically modified microbe of claim 17 wherein at least one subunit is a fusion comprising a heterologous amino acid sequence.
 20-27. (canceled)
 28. A method for using a polypeptide comprising: providing a) a tetrameric polypeptide comprising four subunits, wherein the amino acid sequence of the first subunit and the amino acid sequence of SEQ ID NO:6 have at least 80% identity, wherein the amino acid sequence of the second subunit and the amino acid sequence of SEQ ID NO:8 have at least 80% identity, wherein the amino acid sequence of the third subunit and the amino acid sequence of SEQ ID NO:2 have at least 80% identity, wherein the amino acid sequence of the fourth subunit and the amino acid sequence of SEQ ID NO:4 have at least 80% identity, wherein the tetrameric polypeptide has hydrogenase activity, and wherein the tetrameric polypeptide is present in a genetically modified microbial cell, b) a tetrameric polypeptide comprising four subunits, wherein the amino acid sequence of the first subunit and the amino acid sequence of SEQ ID NO:6 have at least 80% identity, wherein the amino acid sequence of the second subunit and the amino acid sequence of SEQ ID NO:8 have at least 80% identity, wherein the amino acid sequence of the third subunit and the amino acid sequence of SEQ ID NO:2 have at least 80% identity, wherein the amino acid sequence of the fourth subunit and the amino acid sequence of SEQ ID NO:4 have at least 80% identity, wherein the tetrameric polypeptide has hydrogenase activity, and wherein at least one subunit of the tetrameric polypeptide is a fusion comprising a heterologous amino acid sequence, or c) a tetrameric polypeptide consisting of four subunits, wherein the amino acid sequence of the first subunit and the amino acid sequence of SEQ ID NO:2 have at least 80% identity, wherein the amino acid sequence of the second subunit and the amino acid sequence of SEQ ID NO:4 have at least 80% identity, wherein the amino acid sequence of the third subunit and the amino acid sequence of SEQ ID NO:6 have at least 80% identity, wherein the amino acid sequence of the fourth subunit and the amino acid sequence of SEQ ID NO:8 have at least 80% identity, wherein at least one subunit of the tetrameric polypeptide is a fusion comprising a heterologous amino acid sequence, and wherein the tetrameric polypeptide has hydrogenase activity; and incubating the polypeptide under conditions suitable for producing H₂ or NADPH.
 29. The method of claim 28 further comprising collecting the produced H₂ or the produced NADPH.
 30. The method of claim 28 wherein the tetrameric polypeptide of (a) or (b) is an isolated polypeptide.
 31. The method of claim 30 wherein the tetrameric polypeptide is present on a surface.
 32. The method of claim 31 wherein the surface conducts electricity.
 33. The method of claim 32 wherein the surface is an anode.
 34. The method of claim 28 wherein the tetrameric polypeptide of (a) or (b) is chemically modified.
 35. The method of claim 28 wherein the incubating comprises conditions that comprise a polysaccharide.
 36. The method of claim 35 wherein the polysaccharide comprises starch.
 37. The method of claim 28 wherein the conditions comprise a temperature of at least 70° C.
 38-49. (canceled)